Last updated: 2026-06-12

CUDA Runtime

Command

All API Sections

CUDA command wrapper — kernel dispatch and stream/event synchronization (mirrors src/metal/command.zig).

A CudaCommand wraps the context's CUstream plus a per-command CUevent; `commitAsync`/`wait`/`releaseCompleted` give the same overlap the Metal backend gets, backed by CUDA streams + events.

2 exports 6 methods src/cuda/command.zig

2 exports shown

struct

CudaCommand

#
pub const CudaCommand = struct

A recorded stream batch that launches compute kernels on the GPU.

src/cuda/command.zig:12

Methods

6

method

CudaCommand.dispatch

#
pub fn dispatch( self: *CudaCommand, pipe: *const CudaPipeline, grid: [3]u32, block: [3]u32, bufs: []const *const CudaBuffer, push_data: ?*const anyopaque, push_size: usize, shared_bytes: u32, ) void

Launch a kernel on this command's stream.

Bound `bufs` become the leading device-pointer args (capped at 32); `push_data` (push_size bytes) is the trailing by-value push-constant arg.

Parameters
pipe
Compiled CUDA pipeline (kernel + argument layout) to execute.
grid
3-D grid dimensions in thread-blocks (x, y, z).
block
3-D thread-block dimensions (x, y, z).
bufs
Device buffers passed as pointer args before the push constant.
push_data
Pointer to the push-constant struct, or null if unused.
push_size
Byte size of the push-constant struct.
shared_bytes
Dynamic shared memory in bytes to allocate per block.

src/cuda/command.zig:27

method

CudaCommand.barrier

#
pub fn barrier(self: *CudaCommand) void

Same-stream launches are implicitly ordered; no-op for a single stream.

src/cuda/command.zig:60

method

CudaCommand.commitAndWait

#
pub fn commitAndWait(self: *CudaCommand) void

Record completion and block until the stream drains.

src/cuda/command.zig:65

method

CudaCommand.commitAsync

#
pub fn commitAsync(self: *CudaCommand) void

Record completion and return immediately; call `wait` later to sync.

src/cuda/command.zig:73

method

CudaCommand.wait

#
pub fn wait(self: *CudaCommand) void

Block on a previously async-committed command's completion event.

src/cuda/command.zig:81

method

CudaCommand.releaseCompleted

#
pub fn releaseCompleted(self: *CudaCommand) void

Release a command whose completion is guaranteed by a later queue-ordered synchronization (e.g.

the caller waited on a subsequent command in the same stream). Frees the shim handle without issuing another wait.

src/cuda/command.zig:91

function

beginCommand

#
pub fn beginCommand(ctx: ?*shim.CudaCtx) !CudaCommand

Begin a new command (stream batch + completion event) on the given context.

the underlying stream/event resources.

Parameters

ctx
Active CUDA context that owns the stream; must not be null.

Returns

A fresh `CudaCommand` ready for kernel dispatches.

Notes

Returns `error.CudaCommandFailed` if the shim cannot allocate

src/cuda/command.zig:104