Last updated: 2026-06-12

Inference Runtime

Bench Support

Shared helpers for benchmark and standalone runner entrypoints.

This module re-exports the Metal runtime pieces that the benchmark tools need and centralizes the GPU-process-lock error path so the small bench binaries do not duplicate server/runtime boilerplate.

12 exports, 0 methods, src/bench_support.zig

constant

metal_device

pub const metal_device = @import("metal/device.zig")

Metal device initialization and capability queries (MTLDevice wrapper).

src/bench_support.zig:10

constant

metal_loader

pub const metal_loader = @import("model/loader_metal.zig")

Model loader that maps GGUF weights onto Metal buffers.

src/bench_support.zig:12

constant

metal_buffer

pub const metal_buffer = @import("metal/buffer.zig")

Metal buffer allocation and management utilities.

src/bench_support.zig:14

constant

metal_command

pub const metal_command = @import("metal/command.zig")

Metal command queue and command buffer submission helpers.

src/bench_support.zig:16

constant

kernel_timing

pub const kernel_timing = @import("metal/kernel_timing.zig")

Per-kernel Metal dispatch timing probe for profiling individual GPU kernels.

src/bench_support.zig:18

constant

metal_pipeline

pub const metal_pipeline = @import("metal/pipeline.zig")

Metal compute pipeline state cache and compilation helpers.

src/bench_support.zig:20

constant

metal_c

pub const metal_c = @import("metal/c.zig")

Raw Objective-C/Metal C shim types and bindings.

src/bench_support.zig:22

constant

gguf

pub const gguf = @import("model/gguf.zig")

GGUF file parser for reading quantized model weights and metadata.

src/bench_support.zig:24

constant

tokenizer_mod

pub const tokenizer_mod = @import("model/tokenizer.zig")

BPE tokenizer encode and decode for text pre/post-processing.

src/bench_support.zig:26

constant

forward_metal

pub const forward_metal = @import("compute/forward_metal.zig")

Metal forward-pass runtime that runs the full model inference graph.

src/bench_support.zig:28

constant

process_lock

pub const process_lock = @import("gpu/process_lock.zig")

Cross-process GPU ownership lock preventing two zinc processes from sharing a GPU.

src/bench_support.zig:30

function

reportGpuProcessLockError

pub fn reportGpuProcessLockError(err: anyerror, backend: process_lock.Backend, device_index: u32) noreturn

Log a user-facing GPU-process-lock error and terminate the benchmark binary.

Prints a human-readable message to stderr explaining why the lock could not be acquired, then calls `std.process.exit(1)`.

Parameters

err
The lock-acquisition error. `error.GpuAlreadyReserved` gets a dedicated "stop the other instance" message; all other errors fall back to a generic failure message.
backend
The GPU backend whose lock failed (used in the log message).
device_index
Index of the GPU device that could not be locked (used in the log message).

src/bench_support.zig:42
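
The behavior described for `reportGpuProcessLockError` can be illustrated with a minimal, self-contained sketch. It is not the module's actual implementation. The `Backend` enum members, the `acquireSketch` helper, the `reportSketch` name, and the message wording are all hypothetical stand-ins, since this page does not show the real `process_lock` API or the exact log text. Only the shape is taken from the documentation above: a dedicated message for `error.GpuAlreadyReserved`, a generic message for anything else, output on stderr, then `std.process.exit(1)`.

```zig
const std = @import("std");

// Hypothetical stand-in for `process_lock.Backend`; the real enum's members
// are not listed on this page.
const Backend = enum { metal, other };

// Hypothetical stand-in for the real lock-acquisition call.
fn acquireSketch(already_held: bool) error{GpuAlreadyReserved}!void {
    if (already_held) return error.GpuAlreadyReserved;
}

// True when the error is the one that gets the dedicated message.
fn isAlreadyReserved(err: anyerror) bool {
    return err == error.GpuAlreadyReserved;
}

// Sketch of the documented error path: log to stderr, then exit with status 1.
fn reportSketch(err: anyerror, backend: Backend, device_index: u32) noreturn {
    if (isAlreadyReserved(err)) {
        // Dedicated "stop the other instance" message (wording is assumed).
        std.debug.print(
            "GPU {d} ({s}) is already reserved by another process; stop the other instance and retry.\n",
            .{ device_index, @tagName(backend) },
        );
    } else {
        // Generic fallback for every other error.
        std.debug.print(
            "failed to acquire the GPU process lock for {s} device {d}: {s}\n",
            .{ @tagName(backend), device_index, @errorName(err) },
        );
    }
    std.process.exit(1);
}

pub fn main() void {
    // A bench binary would attempt the lock once at startup and hand any
    // failure to the shared reporter instead of formatting its own message.
    acquireSketch(false) catch |err| reportSketch(err, .metal, 0);
    std.debug.print("lock acquired, running benchmark\n", .{});
}
```

Because the reporter is `noreturn`, the `catch` branch needs no fallback value, and the code after it can assume the lock is held. Keeping that path in one shared function is what lets each small bench binary avoid repeating the same error handling.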