support background steps#4416
Conversation
There was a problem hiding this comment.
Pull request overview
Adds first-class “background step” support to the runner by allowing action steps to run concurrently, introducing control steps (wait / wait-all / cancel), and surfacing related metadata to the Results service contracts.
Changes:
- Introduces new pipeline step types (Wait, WaitAll, Cancel) and runner step implementations to coordinate background execution.
- Extends
StepsRunnerto start background actions concurrently, provide wait/cancel semantics, and attempt to isolate per-step GitHub context. - Adds Results/RunService contract fields and L0 tests to validate concurrency and metadata propagation.
Show a summary per file
| File | Description |
|---|---|
| src/Test/L0/Worker/BackgroundStepsL0.cs | Adds L0 coverage for concurrent background steps, waits, cancels, and steps-context thread-safety. |
| src/Sdk/WebApi/WebApi/ResultsHttpClient.cs | Adds mapping of timeline record variables into workflow step payloads (but also includes debug file logging). |
| src/Sdk/WebApi/WebApi/Contracts.cs | Extends Results service Step contract with background/control-step metadata DTOs. |
| src/Sdk/RSWebApi/Contracts/StepResult.cs | Extends RunService step result contract with background/control-step metadata fields. |
| src/Sdk/DTPipelines/Pipelines/WaitStep.cs | Adds pipeline model for a “wait” step. |
| src/Sdk/DTPipelines/Pipelines/WaitAllStep.cs | Adds pipeline model for a “wait-all” step. |
| src/Sdk/DTPipelines/Pipelines/CancelStep.cs | Adds pipeline model for a “cancel” step. |
| src/Sdk/DTPipelines/Pipelines/StepConverter.cs | Enables JSON deserialization for new step types. |
| src/Sdk/DTPipelines/Pipelines/Step.cs | Registers new known step types and extends the StepType enum. |
| src/Sdk/DTPipelines/Pipelines/ActionStep.cs | Adds Background flag and ensures it’s cloned. |
| src/Runner.Worker/WaitStepRunner.cs | Adds runner step type for “wait” control step. |
| src/Runner.Worker/WaitAllStepRunner.cs | Adds runner step type for “wait-all” control step. |
| src/Runner.Worker/CancelStepRunner.cs | Adds runner step type for “cancel” control step. |
| src/Runner.Worker/StepsRunner.cs | Implements background execution, wait/wait-all/cancel handling, slot limiting, and GitHubContext isolation. |
| src/Runner.Worker/StepsContext.cs | Adds locking around step output/outcome/conclusion mutations. |
| src/Runner.Worker/JobExtension.cs | Wires new pipeline step types into job initialization and sets timeline variables for results metadata. |
| src/Runner.Worker/ExecutionContext.cs | Adds timeline-record variable setter and maps those variables into StepResult; adjusts template evaluator feature gating. |
| src/Runner.Worker/BackgroundStepContext.cs | Adds per-background-step tracking (task, CTS, result, external id). |
Copilot's findings
Comments suppressed due to low confidence (4)
src/Sdk/WebApi/WebApi/ResultsHttpClient.cs:568
File.AppendAllText("/tmp/bg-steps-debug.log", ...)is executed without a try/catch here. If the path is unavailable (e.g., Windows runner, locked filesystem, permissions), it will throw and can break step updates. Remove this statement or guard it behind safe, platform-appropriate diagnostics.
System.IO.File.AppendAllText("/tmp/bg-steps-debug.log", $"[BG-DEBUG] Result: name={step.Name}, isBackground={step.IsBackground}, stepType={step.StepType}\n");
return step;
src/Sdk/WebApi/WebApi/ResultsHttpClient.cs:644
- Serializing and logging full
StepsUpdateRequestJSON payloads to/tmpcan expose tokens/PII (steps metadata can contain user-controlled values) and adds unbounded I/O in a hot path. Please remove this debug logging or route it through existing trace logging with appropriate redaction and opt-in controls.
// DEBUG: Serialize and log the JSON payload
try
{
var json = Newtonsoft.Json.JsonConvert.SerializeObject(request, Newtonsoft.Json.Formatting.None);
System.IO.File.AppendAllText("/tmp/bg-steps-debug.log", $"[BG-DEBUG] JSON payload: {json}\n");
}
catch (Exception ex)
{
System.IO.File.AppendAllText("/tmp/bg-steps-debug.log", $"[BG-DEBUG] Serialize error: {ex.Message}\n");
}
src/Runner.Worker/StepsRunner.cs:605
- In
HandleWaitAsync, the localcompletedis assigned but never used. With TreatWarningsAsErrors enabled in this repo, this will fail the build. Remove the unused assignment (or use it if intended).
Trace.Info($"Waiting for {tasks.Count} background step(s)...");
var cancelTask = Task.Delay(Timeout.Infinite, cancellationToken);
var completed = await Task.WhenAny(Task.WhenAll(tasks), cancelTask);
if (cancellationToken.IsCancellationRequested)
{
src/Runner.Worker/JobExtension.cs:494
- This sets
cancel_step_idon the cancel step timeline record to the logical step id, but later logic expects to publish the background step's external/timeline id. IfStepsRunnercan't resolve the id, this logical value will be reported upstream. Consider deferring this variable until you can resolve the external id (or always overwrite/clear it during execution).
cancelRunner.ExecutionContext.SetTimelineRecordVariable("step_type", "cancel");
if (!string.IsNullOrEmpty(cancelRunner.CancelStepId))
{
cancelRunner.ExecutionContext.SetTimelineRecordVariable("cancel_step_id", cancelRunner.CancelStepId);
}
- Files reviewed: 18/18 changed files
- Comments generated: 14
|
|
||
| [DataContract] | ||
| [JsonObject(NamingStrategyType = typeof(SnakeCaseNamingStrategy))] | ||
| public class CancelControl |
There was a problem hiding this comment.
do we need a nested struct with only 1 field?
| }; | ||
|
|
||
| // Populate background step metadata from TimelineRecord.Variables | ||
| if (r.Variables.TryGetValue("is_background", out var bgVar) && bgVar.Value == "true") |
There was a problem hiding this comment.
🤔 should we stay away from r.Variables since that might cause problem when the latest runner connect to old GHES?
| var timestamp = DateTime.UtcNow.ToString(Constants.TimestampFormat, CultureInfo.InvariantCulture); | ||
| var stepRecords = records.Where(r => String.Equals(r.RecordType, "Task", StringComparison.Ordinal)); | ||
| var stepRecords = records.Where(r => String.Equals(r.RecordType, "Task", StringComparison.Ordinal)).ToList(); | ||
|
|
| public sealed class GlobalContext | ||
| { | ||
| // Lock for thread-safe access to shared collections during concurrent background step execution | ||
| public readonly object CollectionLock = new object(); |
There was a problem hiding this comment.
instead of using lock, should we change everything to be thread-safe, ex: using concurrecnybag/dictionary/queue, etc.
So we won't have problem for any future code that forgot to add lock
| { | ||
| private static readonly Regex _propertyRegex = new("^[a-zA-Z_][a-zA-Z0-9_]*$", RegexOptions.Compiled); | ||
| private readonly DictionaryContextData _contextData = new(); | ||
| private readonly object _lock = new(); |
There was a problem hiding this comment.
same question here, should we make _contextData thread-safe, so we don't need the lock.
| public sealed class StepsRunner : RunnerService, IStepsRunner | ||
| { | ||
| // Track active background steps | ||
| private const int DefaultMaxBackgroundSteps = 10; |
There was a problem hiding this comment.
do we still need a default if we always sends it from the service?
| Cts = bgCts, | ||
| }; | ||
|
|
||
| bgContext.ExecutionTask = Task.Run(async () => |
There was a problem hiding this comment.
we don't have to use Task.Run(async () => since we are in an async task method.
| }; | ||
|
|
||
| // Populate background step metadata from timeline record variables | ||
| if (_record.Variables.TryGetValue("is_background", out var bgVar) && bgVar.Value == "true") |
There was a problem hiding this comment.
Are timeline record variables used internally to pass info around within the runner?
I'm wondering if this information is available in a first class way internally (e.g. instance step instance type), rather than grabbing from a property bag here.
| void SetTimeout(TimeSpan? timeout); | ||
|
|
||
| // Background step output deferral | ||
| Dictionary<string, string> DeferredOutputs { get; set; } |
There was a problem hiding this comment.
Properties are sorted before methods.
This interface is huge. I'm wondering if it would be good to organize all the new "Deferred" properties together as one visual block, and all the new methods "Flush" methods together as one visual block.
| } | ||
| } | ||
|
|
||
| public Dictionary<string, string> DeferredOutputs { get; set; } |
There was a problem hiding this comment.
Are all the other properties sorted above methods?
Summary
Adds runner support for background steps - enabling concurrent step execution within a single GitHub Actions job. This introduces four new workflow YAML keywords:
background: true,wait,wait-all, andcancel.Motivation
Today, all steps in a job run sequentially. Common patterns like "start a dev server, run tests, stop the server" or "upload artifacts while tests continue" require workarounds (
&backgrounding, service containers). Background steps provide first-class support for concurrent step execution with explicit synchronization primitives.What's Changed
New step types (SDK layer)
WaitStep,WaitAllStep,CancelStep— new pipeline step types with JSON deserialization supportActionStep.Background— boolean property indicating a step should run asynchronouslyCore execution engine (
StepsRunner.cs)Task.Runand execute concurrently with subsequent foreground stepswait: <id>blocks until specific background step(s) completewait-all: trueblocks until all prior background steps completecancel: <id>sends SIGTERM with a 7.5s grace periodSemaphoreSlimwait-allinjected before post-job hooks if any background steps haven't been explicitly waited on (visible in UI as a timeline entry)Thread safety (
StepsContext.cs)lockaroundSetOutput,SetConclusion,SetOutcome— background steps write to the shared steps context from thread pool threadsRelated to
https://github.com/github/actions-runtime/issues/5420
https://github.com/github/actions-runtime/issues/5421