fix(ssh): fail exec/script/copy methods on non-zero exit (#510) #22
Loading…
Add table
Add a link
Reference in a new issue
No description provided.
Delete branch "worktree-510"
Deleting a branch is permanent. Although the deleted branch may continue to exist for a short time before it actually gets removed, it CANNOT be undone in most cases. Continue?
Summary
exec,script, andcopynow fail the method when any host process exits non-zero, is killed by a signal, or fails to spawn. Previously these methods always reportedsucceededeven when the underlyingssh/scpprocess failed — silently allowing downstream workflow steps gated ondependsOn: { condition: succeeded }to proceed past broken operations.What changed
ssh/extensions/models/_lib/operations.tsthrowOnHostFailures()helper; wired intorunExec,runScript,runCopyafter resource writesssh/extensions/models/ssh_test.tsssh/extensions/models/ssh.ts2026.05.29.2→2026.06.01.1, upgrades entryssh/manifest.yamlssh/README.mdWhy this is correct
The
runForwardmethod (operations.ts:744) already throws on non-zero exit —exec,script, andcopywere the outliers. This aligns all command-execution methods with the same contract: a method that returns normally =succeeded; a method that throws =failed. Swamp has no partial-success status, so the throw is the only mechanism to propagate failure to workflow dependency conditions.Impact
Workflows that previously relied on
exec/script/copyalways succeeding will now see method failures when the underlying process fails. This is the intended fix — the issue reporter's build pipeline silently skipped a file copy and proceeded to configure a VM with missing data because thecopymethod masked thescpexit code 255.Verification
Verified end-to-end against a local
@swamp/sshinstance:RunResult resource confirmed persisted with
exitCode: 255and full stderr.Test plan
deno test)deno check,deno lint,deno fmt --checkcleandeno install --frozen— no lockfile driftfailedon non-zero exitCloses #510
Co-authored-by: stateless
Co-authored-by: Claude Opus 4.6 (1M context) noreply@anthropic.com
The exec, script, and copy methods in @swamp/ssh silently reported "succeeded" even when the underlying ssh/scp process exited non-zero. RunResult resources correctly recorded the exit code and stderr, but the method status was never checked — so workflows gated on `dependsOn: { condition: succeeded }` proceeded past failed operations. Add a shared `throwOnHostFailures` check that runs after all RunResult resources are written. It inspects results for genuine process failures (non-zero exit code, signal kill, spawn errors) while excluding fail-fast "skipped" entries. On failure, it throws an aggregate error naming each failed host with its exit code and last stderr line. Verified end-to-end: `swamp model method run ... exec` now returns `method_execution_failed` when ssh exits 255, and the RunResult resource is still persisted for diagnostics. Co-Authored-By: stateless <stateless@users.noreply.swamp.club> Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>Code Review
Blocking Issues
None.
Suggestions
No test for the signal-kill failure path (operations.ts line 224). throwOnHostFailures handles three failure conditions: non-zero exit, signal kill, and spawn error. Every test executor in ssh_test.ts returns signal null. The signal branch is structurally correct, but a test with a CommandExecutor that resolves with a non-null signal string would give full branch coverage of the new code.
Stderr surfaces in the thrown error message (operations.ts line 233). Including the last line of remote stderr in the method-level error is the right call for operational visibility. No credentials-leak risk: passwords travel via SSHPASS env var, never through stderr. Worth noting as a deliberate design choice rather than an oversight.