Last updated: 2026-06-12
CUDA Runtime
Command
CUDA command wrapper — kernel dispatch and stream/event synchronization (mirrors src/metal/command.zig).
A CudaCommand wraps the context's CUstream plus a per-command CUevent; `commitAsync`/`wait`/`releaseCompleted` give the same overlap the Metal backend gets, backed by CUDA streams + events.
struct
CudaCommand
pub const CudaCommand = struct
A recorded stream batch that launches compute kernels on the GPU.
Methods (6)
method
CudaCommand.dispatch
pub fn dispatch(
    self: *CudaCommand,
    pipe: *const CudaPipeline,
    grid: [3]u32,
    block: [3]u32,
    bufs: []const *const CudaBuffer,
    push_data: ?*const anyopaque,
    push_size: usize,
    shared_bytes: u32,
) void
Launch a kernel on this command's stream. Bound `bufs` become the leading device-pointer args (capped at 32); `push_data` (`push_size` bytes) is the trailing by-value push-constant arg.
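The argument order described above can be sketched on the host without a GPU. The block below is a minimal illustration, not the module's actual code: `FakeBuffer`, `packArgs`, and `max_bufs` are hypothetical names, and silently dropping buffers past the cap is an assumption (the real `dispatch` may assert or report an error instead). It builds the kind of pointer array that CUDA's `cuLaunchKernel` takes as `kernelParams`, where each slot points at the storage of one kernel argument.

```zig
const std = @import("std");

/// Cap on bound buffers, taken from the doc comment for `dispatch`.
const max_bufs = 32;

/// Hypothetical stand-in for CudaBuffer: only the device address matters here.
const FakeBuffer = struct { dev_ptr: u64 };

/// Fills `out` with one pointer per kernel argument: buffer device pointers
/// first, then the push-constant block (if any). Returns the slot count.
fn packArgs(
    bufs: []const *const FakeBuffer,
    push_data: ?*const anyopaque,
    out: *[max_bufs + 1]?*const anyopaque,
) usize {
    // Assumption: buffers past the cap are dropped rather than rejected.
    const n: usize = @min(bufs.len, max_bufs);
    for (bufs[0..n], 0..) |buf, i| {
        out[i] = &buf.dev_ptr; // slot points at the device-pointer value
    }
    if (push_data) |p| {
        out[n] = p; // trailing by-value argument
        return n + 1;
    }
    return n;
}

pub fn main() void {
    const a = FakeBuffer{ .dev_ptr = 0x1000 };
    const b = FakeBuffer{ .dev_ptr = 0x2000 };
    const Push = extern struct { n: u32, scale: f32 };
    const push = Push{ .n = 256, .scale = 0.5 };

    var args: [max_bufs + 1]?*const anyopaque = undefined;
    const count = packArgs(&.{ &a, &b }, &push, &args);
    std.debug.print("kernel args: {d}\n", .{count});
}
```

With two buffers and one push-constant struct, the kernel would see three parameters: two device pointers followed by the struct by value.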
method
CudaCommand.barrier
pub fn barrier(self: *CudaCommand) void
No-op for a single stream: launches on the same stream are already implicitly ordered.
method
CudaCommand.commitAndWait
pub fn commitAndWait(self: *CudaCommand) void
Record completion and block until the stream drains.
method
CudaCommand.commitAsync
pub fn commitAsync(self: *CudaCommand) void
Record completion and return immediately; call `wait` later to sync.
method
CudaCommand.wait
pub fn wait(self: *CudaCommand) void
Block on a previously async-committed command's completion event.
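The async path splits submission from synchronization, so the host can keep working while the GPU drains the stream. The sketch below models that call order with a hypothetical, host-only `MockCommand`; it is not the real `CudaCommand`, which needs a CUDA context. Only the method names and their ordering come from the documentation above; `prepareNextBatch` is a placeholder for whatever CPU work the caller wants to overlap.

```zig
const std = @import("std");

/// Hypothetical host-only stand-in for CudaCommand. It tracks state only;
/// per the docs, the real type records a completion event on the stream.
const MockCommand = struct {
    const State = enum { recording, in_flight, done };

    state: State = .recording,
    launches: u32 = 0,

    fn dispatch(self: *MockCommand) void {
        std.debug.assert(self.state == .recording);
        self.launches += 1;
    }

    /// Models "record completion and return immediately".
    fn commitAsync(self: *MockCommand) void {
        std.debug.assert(self.state == .recording);
        self.state = .in_flight;
    }

    /// Models "block on the completion event".
    fn wait(self: *MockCommand) void {
        std.debug.assert(self.state == .in_flight);
        self.state = .done;
    }
};

/// Placeholder for host work that overlaps with GPU execution.
fn prepareNextBatch() u32 {
    var sum: u32 = 0;
    var i: u32 = 0;
    while (i < 8) : (i += 1) sum += i;
    return sum;
}

pub fn main() void {
    var cmd = MockCommand{};
    cmd.dispatch();
    cmd.dispatch();
    cmd.commitAsync(); // returns at once; kernels keep running

    const staged = prepareNextBatch(); // CPU work overlaps the GPU here

    cmd.wait(); // results are safe to read only after this returns
    std.debug.print("launches={d} staged={d}\n", .{ cmd.launches, staged });
}
```

`commitAndWait` corresponds to calling `commitAsync` and `wait` back to back, with no host work in between.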
method
CudaCommand.releaseCompleted
pub fn releaseCompleted(self: *CudaCommand) void
Release a command whose completion is guaranteed by a later queue-ordered synchronization (e.g. the caller waited on a subsequent command in the same stream). Frees the shim handle without issuing another wait.
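Why this is safe follows from in-order stream execution: once a later command on the same stream has completed, every earlier one has too. The sketch below models that reasoning with a single counter. `MockStream`, `MockCommand`, and the `ticket` field are hypothetical illustration names rather than the module's types, and having `wait` also mark the command released is an assumption.

```zig
const std = @import("std");

/// Hypothetical model of one in-order stream: work completes strictly in
/// submission order, so one counter describes everything that has finished.
const MockStream = struct {
    submitted: u64 = 0,
    completed: u64 = 0,
};

/// Hypothetical stand-in for CudaCommand; `ticket` plays the role of the
/// per-command completion event.
const MockCommand = struct {
    stream: *MockStream,
    ticket: u64 = 0,
    released: bool = false,

    fn commitAsync(self: *MockCommand) void {
        self.stream.submitted += 1;
        self.ticket = self.stream.submitted;
    }

    /// Blocking wait: everything up to and including this ticket is done.
    /// Assumption: waiting also releases the command's handle.
    fn wait(self: *MockCommand) void {
        self.stream.completed = @max(self.stream.completed, self.ticket);
        self.released = true;
    }

    /// No synchronization: only valid once a later wait already covers us.
    fn releaseCompleted(self: *MockCommand) void {
        std.debug.assert(self.stream.completed >= self.ticket);
        self.released = true;
    }
};

pub fn main() void {
    var stream = MockStream{};
    var first = MockCommand{ .stream = &stream };
    var second = MockCommand{ .stream = &stream };

    first.commitAsync();
    second.commitAsync();

    second.wait(); // one blocking sync on the newest command
    first.releaseCompleted(); // older command is freed without a second wait

    std.debug.print("completed through ticket {d}\n", .{stream.completed});
}
```

Calling `releaseCompleted` before any covering wait would trip the assertion in this model; in the real backend the equivalent mistake would presumably free a handle whose work is still in flight.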
function
beginCommand
pub fn beginCommand(ctx: ?*shim.CudaCtx) !CudaCommand
Begin a new command (stream batch + completion event) on the given context.
the underlying stream/event resources.