Runner
What it does
The Runner is the self-hosted process that executes the work the Server hands out. When a Module Job is dispatched, a Runner picks it up over its SignalR connection, fetches the Module source, materialises Inputs and Extra Files into the working directory, executes the configured engine commands (init, plan, apply, destroy, output), streams logs back to the Server in real time, and reports the final status.
The Runner is where credentials with real blast radius live: cloud provider keys, Kubernetes service accounts, on-prem API tokens. By deploying Runners with narrowly scoped permissions and binding Modules to them via Runner Assignments, you control which Modules can act on which infrastructure. The Server itself never holds these credentials.
The Runner is bundled with Snap CD in both editions and is not license-gated — Cloud-edition customers run their own Runners against snapcd.io, and Self-Hosted customers run them against their own Server.
Prerequisite Server Resources
The following resources must exist on the Server before a Runner can connect and pick up Jobs:
| Resource | Notes |
|---|---|
| Service Principal | The identity the Runner authenticates as. The Runner uses its ClientId and ClientSecret to obtain JWTs from /connect/token |
| Runner record | The Runner binds to this record. Its Id is what the Runner reports via Runner.Id, and its ServicePrincipalId must reference the Service Principal above |
| Runner Assignment | At least one, covering every Module the Runner is to handle. Created at Stack / Namespace / Module scope or, as a shortcut, by setting is_assigned_to_all_modules on the Runner record |
Other Prerequisites
What must be present on the Runner host or its network:
| Prerequisite | Notes |
|---|---|
| Engine binary | terraform, tofu or pulumi must be present on PATH (or via Engine.AdditionalBinaryPaths) for whichever Engines you intend to use |
| Network egress to the Server | A long-lived outbound HTTPS / WebSocket connection to the Server’s /runnerhub. No inbound ports are required on the Runner host |
| Provider credentials | Whatever the Modules running on this Runner need — cloud credentials, kubeconfig, on-prem tokens. Bound to the host (env vars, instance metadata, mounted kubeconfig) |
| Writable working directory | Path for fetched Module source and engine state. Defaults to ~/.snapcd/runner |
Deployment
See Deployment > Guide > Runner for the reference deployment repositories (Docker, Kubernetes, local) and the minimum Compose shape.
Connection model
A Runner connects to the Server’s /runnerhub with a JWT obtained via client_credentials against /connect/token. The connection is long-lived and bidirectional:
- Server → Runner. Job dispatches, cancellation signals, and configuration updates
- Runner → Server. Log envelopes, step status updates, terminal Job results
When the Runner connects, it announces its Runner.Instance name. The Server records the connection in the database. Job dispatch then targets either:
- Any available Runner against this record (the default — first to respond handles the Job)
- A specific Runner, when the Module sets runner_instance_name to pin to one by name
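The two dispatch modes above can be sketched as a small selection function. This is a minimal illustration only, not the Server's implementation: the function name `dispatch_targets` and the idea of representing connected Runners as a list of Runner.Instance names are assumptions made for the example.

```python
from typing import List, Optional


def dispatch_targets(connected_instances: List[str],
                     runner_instance_name: Optional[str] = None) -> List[str]:
    """Return the Runner.Instance names a Job would be offered to.

    Hypothetical helper: models the two dispatch modes described above.
    """
    if runner_instance_name is None:
        # Default mode: every connected instance is a candidate and the
        # first one to respond handles the Job.
        return list(connected_instances)
    # Pinned mode: only the named instance is eligible, and only while
    # it is actually connected.
    return [name for name in connected_instances if name == runner_instance_name]
```

With no pin, every connected instance is a candidate; with a pin naming an instance that is not connected, the candidate list is empty and the Job has nowhere to go until that instance returns.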
Reconnect and outages
The SignalR client reconnects automatically. While disconnected:
- Outgoing log envelopes buffer in-process; on reconnect, the buffer flushes
- The Runner does not pull new Jobs (it can’t — dispatch is push-based)
- Jobs already in-flight continue executing locally; their terminal status posts on reconnect
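The buffering behaviour in the first bullet can be pictured with a short sketch. It assumes a simple in-memory queue and a caller-supplied `send` callable; the class name `BufferedLogSender` and its methods are hypothetical and do not reflect the Runner's actual code.

```python
from collections import deque
from typing import Callable


class BufferedLogSender:
    """Hypothetical model of 'buffer while disconnected, flush on reconnect'."""

    def __init__(self, send: Callable[[str], None]) -> None:
        self._send = send          # delivers one envelope to the Server
        self._pending = deque()    # envelopes held while disconnected
        self.connected = True

    def emit(self, envelope: str) -> None:
        if self.connected:
            self._send(envelope)
        else:
            # Connection is down: keep the envelope in process memory.
            self._pending.append(envelope)

    def on_disconnect(self) -> None:
        self.connected = False

    def on_reconnect(self) -> None:
        self.connected = True
        # Flush in arrival order so the Server sees logs in sequence.
        while self._pending:
            self._send(self._pending.popleft())
```

Because the buffer lives in process memory, anything still queued is lost if the Runner process itself exits before reconnecting.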
The Server treats a Runner as offline once its RunnerConnection row is gone. A Runner that crashes hard mid-Job will resume reporting on its next start; the Server reconciles by treating any Job whose owning Runner has disconnected as eligible for dispatch to the next available Runner.
Multiple Runners per record
When the Runner record has allow_multiple_instances = true, you can run replicas (for example a Kubernetes StatefulSet with replicas: 3). Job dispatch follows the Runner Selection model — by default the Server broadcasts each Job to all connected Runners and the first to respond handles it. Each replica must report a distinct Runner.Instance name.
Operations & observability
The Runner emits two kinds of logs:
- Runtime / diagnostic logs: standard Microsoft.Extensions.Logging (MEL) ILogger output to stdout, filtered by the Logging.LogLevel section. These cover Runner startup, connection events, Job pickup and so on
- Job logs: the engine output for each Job step, shipped to the Server in batches over the /runnerhub connection and visible in the Dashboard’s Jobs view. These are not written to stdout
Job-log shipping is batched: the Runner accumulates events for JobLogStream.PeriodSeconds (default 5) or up to JobLogStream.BatchSizeLimit (default 50), whichever comes first, then ships a single batch. The first event of each batch ships immediately when JobLogStream.EagerlyEmitFirstEvent is true (default), keeping initial Job output responsive on the Dashboard.
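The size-or-period rule can be illustrated with a simplified, single-threaded model. Everything here is an assumption made for the sketch: the class `LogBatcher`, the explicit `now` arguments (instead of a real timer), measuring the period from the first pending event, and applying eager emission only to the very first event the batcher sees. The Runner's actual pipeline may behave differently in all of these respects.

```python
from typing import Callable, List, Optional


class LogBatcher:
    """Hypothetical model of batching by BatchSizeLimit or PeriodSeconds."""

    def __init__(self, ship: Callable[[List[str]], None],
                 batch_size_limit: int = 50,
                 period_seconds: float = 5.0,
                 eagerly_emit_first_event: bool = True) -> None:
        self._ship = ship
        self._limit = batch_size_limit
        self._period = period_seconds
        self._eager = eagerly_emit_first_event
        self._pending: List[str] = []
        self._window_start: Optional[float] = None
        self._seen_any = False

    def add(self, event: str, now: float) -> None:
        if self._eager and not self._seen_any:
            # Simplification: only the very first event ships on its own.
            self._seen_any = True
            self._ship([event])
            return
        self._seen_any = True
        if not self._pending:
            self._window_start = now   # a new batch window opens here
        self._pending.append(event)
        if len(self._pending) >= self._limit:
            self.flush()               # size threshold reached first

    def tick(self, now: float) -> None:
        """Call periodically; flushes once the period has elapsed."""
        if self._pending and now - self._window_start >= self._period:
            self.flush()

    def flush(self) -> None:
        if self._pending:
            batch, self._pending = self._pending, []
            self._ship(batch)
```

Lowering both values reduces per-log latency on the Dashboard at the cost of more, smaller batches on the wire.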
The Runner has no built-in dashboard. Operator-facing visibility is the Server’s Dashboard:
- The Runners page shows each Runner record with its connected processes and a live online / offline badge
- The Jobs view shows in-flight Jobs and streams their logs as they arrive
For the Runner host itself, treat it as a standard container workload: stdout to your log aggregator, container metrics to your usual collector.
Settings
The Runner reads its settings from the standard layered pipeline described in Deployment > Settings. Production deployments typically source Runner.Credentials.ClientSecret from a vault via the External Settings provider rather than placing it in plain-text settings.
Generated from the Runner’s published JSON Schema, the same schema operators reference via "$schema" in their appsettings.json to get editor IntelliSense.
Engine
object
Discovery hints for the engine binaries (terraform, tofu, pulumi) the Runner invokes per Job. The Runner looks for binaries on PATH first; entries here extend that search.
AdditionalBinaryPaths
array of string
Extra directories prepended to the Runner's binary-search path. Supports leading ~ expansion. Useful when an engine ships in a non-standard location — for example ~/.pulumi/bin for a per-user Pulumi install.
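As a rough illustration of how such a lookup could work, the sketch below expands a leading ~ in each extra directory and places those directories ahead of PATH, as this field description states. The function name `find_engine_binary` and its parameters are hypothetical, and the Runner's real search order and tie-breaking are not confirmed by this page.

```python
import os
import shutil
from typing import List, Optional


def find_engine_binary(name: str,
                       additional_binary_paths: List[str],
                       env_path: Optional[str] = None) -> Optional[str]:
    """Hypothetical lookup: extra directories first, then PATH."""
    if env_path is None:
        env_path = os.environ.get("PATH", "")
    # Expand a leading ~ to the current user's home directory.
    extras = [os.path.expanduser(p) for p in additional_binary_paths]
    parts = extras + ([env_path] if env_path else [])
    # shutil.which accepts an explicit search path string.
    return shutil.which(name, path=os.pathsep.join(parts))
```

For example, `find_engine_binary("pulumi", ["~/.pulumi/bin"])` returns the full path of the binary if one is found, or `None` otherwise.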
HooksPreapproval
object
Optional content-based allowlist for Hook scripts the Runner is permitted to execute. When enabled, every Hook a Job tries to run must match (by SHA256) a file in the allowlist directory or it is refused. Intended for security-sensitive deployments where the set of shippable Hooks must be reviewed out-of-band.
Enabled
boolean
Enable or disable hook pre-approval validation. When enabled, all incoming hooks must match a pre-approved hook from the PreapprovedHooksDirectory.
PreapprovedHooksDirectory
string
Directory containing pre-approved hook scripts. Each file in this directory is considered a pre-approved hook. File names don't matter - only file content is used for validation.
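A content-based allowlist of this kind can be sketched in a few lines. This is an illustration under stated assumptions, not the Runner's implementation: the function names are hypothetical, the directory is scanned non-recursively, and hook content is hashed byte-for-byte with no line-ending normalisation.

```python
import hashlib
from pathlib import Path
from typing import Set


def load_preapproved_digests(directory: str) -> Set[str]:
    """Hash every file in the directory; file names are ignored."""
    return {
        hashlib.sha256(path.read_bytes()).hexdigest()
        for path in Path(directory).iterdir()
        if path.is_file()
    }


def is_hook_allowed(hook_content: bytes, enabled: bool,
                    preapproved: Set[str]) -> bool:
    """Allow everything when disabled; otherwise require a SHA256 match."""
    if not enabled:
        return True
    return hashlib.sha256(hook_content).hexdigest() in preapproved
```

Because matching is on content alone, renaming an approved file has no effect, while a one-byte change to a hook script causes it to be refused.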
JobLogStream
object
Tunables for the per-Job log shipping pipeline that streams engine output back to the Server over SignalR. Defaults are sensible for typical workloads; tune BatchSizeLimit and PeriodSeconds together if you need lower per-log latency at the cost of more frequent network round-trips.
BatchSizeLimit
integer
Default: 50
Maximum number of log events to ship in a single batch. The PeriodicBatchingSink will flush early when this size is reached even before PeriodSeconds elapses.
EagerlyEmitFirstEvent
boolean
Default: true
When true, the first event in a fresh batch is emitted immediately rather than waiting for the period or size threshold. Keeps initial job output responsive.
PeriodSeconds
integer
Default: 5
Maximum wall-clock interval, in seconds, between batch flushes. A batch ships whenever either BatchSizeLimit or this period is reached.
Logging
object
Standard .NET Logging configuration. See https://learn.microsoft.com/dotnet/core/extensions/logging-configuration for the full reference. Provider-specific sub-blocks (Console, Debug, EventSource, etc.) are accepted but not enumerated here.
LogLevel
object
Map of log category names (or category prefixes) to minimum log levels. 'Default' applies when no more-specific category matches; longer keys override shorter ones (Microsoft.AspNetCore beats Microsoft beats Default).
Allowed values: Trace, Debug, Information, Warning, Error, Critical, None
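The longest-key-wins rule can be made concrete with a small resolver. This is a simplified model of the matching described above, not the .NET implementation: it uses plain prefix matching, the helper names are hypothetical, and the fallback to Information when no Default key is present is an assumption of the sketch.

```python
from typing import Dict

# Ordered from most to least verbose; "None" disables logging entirely.
LEVELS = ["Trace", "Debug", "Information", "Warning", "Error", "Critical", "None"]


def minimum_level(category: str, log_level: Dict[str, str]) -> str:
    """Pick the level of the longest key that prefixes the category."""
    best_key = None
    for key in log_level:
        if key == "Default":
            continue
        if category.startswith(key) and (best_key is None or len(key) > len(best_key)):
            best_key = key
    if best_key is not None:
        return log_level[best_key]
    return log_level.get("Default", "Information")


def is_enabled(category: str, level: str, log_level: Dict[str, str]) -> bool:
    """True when a message at `level` passes the category's minimum."""
    minimum = minimum_level(category, log_level)
    if minimum == "None":
        return False
    return LEVELS.index(level) >= LEVELS.index(minimum)
```

Given `{"Default": "Information", "Microsoft": "Warning", "Microsoft.AspNetCore": "Error"}`, a category of `Microsoft.AspNetCore.Routing` resolves to Error, `Microsoft.Extensions.Hosting` to Warning, and anything outside the Microsoft prefix to Information.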
Runner
object
Identity, organisation and credentials that bind this Runner process to a Runner record on the Server. All four fields are required for the Runner to authenticate and connect.
Credentials
object
Service Principal credentials the Runner authenticates with. The Service Principal referenced here must be the one bound to the Runner record via service_principal_id.
ClientId
string
The Service Principal's client identifier. Supply only the raw client ID here; the Runner automatically adds the Organization ID prefix when it calls the token endpoint.
ClientSecret
string
The Service Principal's client secret. Sensitive — production deployments should source this via the External Settings provider rather than committing it to appsettings.json.
Id
string (uuid)
Identifier of the Runner record on the Server this process binds to.
Instance
string
Name this Runner reports when it connects, used to distinguish replicas when allow_multiple_instances is set on the Runner record. Visible in the Dashboard's Runners page next to the parent record.
OrganizationId
string (uuid)
Identifier of the Organization this Runner belongs to. Must match the Organization in which the Runner record referenced by Id was created.
Server
object
Coordinates of the Snap CD Server the Runner connects to.
Url
string
Base URL of the Snap CD Server, including scheme and port. The Runner opens its SignalR connection to {Url}/runnerhub and obtains JWTs from {Url}/connect/token.
WorkingDirectory
object
Filesystem locations the Runner uses for fetched Module source and ephemeral state. Both paths support leading ~ expansion to the host user's home directory.
TempDirectory
string
Directory for ephemeral per-Job scratch space. Cleaned between Jobs. Typically ~/.snapcd/runner/.temp.
WorkingDirectory
string
Root directory under which the Runner persists fetched Module source, engine state and per-Job outputs. Must be writable by the Runner process. Typically ~/.snapcd/runner.
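To show how the sections above might fit together, the sketch below assembles a settings document as a Python dictionary and prints it as JSON. The nesting is inferred from the dotted names used on this page (for example Runner.Credentials.ClientSecret and JobLogStream.PeriodSeconds) and should be checked against the published schema. All values are placeholders: the UUIDs, the instance name, the Server URL and the hooks directory are hypothetical, and `missing_runner_fields` is an illustrative helper rather than part of the Runner.

```python
import json
from typing import Any, Dict, List

# Placeholder values only; replace with values from your own Server.
settings: Dict[str, Any] = {
    "Runner": {
        "Id": "00000000-0000-0000-0000-000000000001",
        "Instance": "runner-0",
        "OrganizationId": "00000000-0000-0000-0000-000000000002",
        "Credentials": {
            "ClientId": "example-client-id",
            "ClientSecret": "<sourced-from-a-vault>",
        },
    },
    "Server": {"Url": "https://snapcd.example.com:443"},
    "WorkingDirectory": {
        "WorkingDirectory": "~/.snapcd/runner",
        "TempDirectory": "~/.snapcd/runner/.temp",
    },
    "Engine": {"AdditionalBinaryPaths": ["~/.pulumi/bin"]},
    "HooksPreapproval": {
        "Enabled": False,
        "PreapprovedHooksDirectory": "/path/to/preapproved-hooks",
    },
    "JobLogStream": {
        "BatchSizeLimit": 50,
        "PeriodSeconds": 5,
        "EagerlyEmitFirstEvent": True,
    },
    "Logging": {"LogLevel": {"Default": "Information"}},
}


def missing_runner_fields(s: Dict[str, Any]) -> List[str]:
    """List required Runner settings that are absent or empty."""
    runner = s.get("Runner", {})
    creds = runner.get("Credentials", {})
    required = {
        "Runner.Id": runner.get("Id"),
        "Runner.Instance": runner.get("Instance"),
        "Runner.OrganizationId": runner.get("OrganizationId"),
        "Runner.Credentials.ClientId": creds.get("ClientId"),
        "Runner.Credentials.ClientSecret": creds.get("ClientSecret"),
    }
    return [name for name, value in required.items() if not value]


if __name__ == "__main__":
    print(json.dumps(settings, indent=2))
```

In production the ClientSecret placeholder would normally stay out of the file entirely and be supplied through the External Settings provider, as noted above.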
See the Resources area for per-resource semantics (Hooks, Engine, and so on).