Skip to content

Fix safe-outputs YAML corruption when OTLP header masking is enabled#37068

Open
Copilot wants to merge 5 commits into
mainfrom
copilot/fix-invalid-yaml-outputs-job
Open

Fix safe-outputs YAML corruption when OTLP header masking is enabled#37068
Copilot wants to merge 5 commits into
mainfrom
copilot/fix-invalid-yaml-outputs-job

Conversation

Copy link
Copy Markdown
Contributor

Copilot AI commented Jun 5, 2026

observability.otlp.headers could produce invalid workflow YAML in safe_outputs when GitHub App token minting is also present. The app-token step insertion could land inside another step's multi-line block, causing run: content to be emitted at with: indentation.

  • Problem scope

    • Header-triggered OTLP masking adds setup-time steps in safe_outputs.
    • Attribute-triggered OTLP masking (GH_AW_OTLP_ATTRIBUTES) similarly shifts the insertion point.
    • Existing app-token insertion index logic did not reliably align to step boundaries under these compositions.
  • Compiler changes

    • Updated safe_outputs app-token insertion index calculation to account for OTLP mask step injection, deriving the offset dynamically from the actual helper output (strings.Count) so the arithmetic stays self-consistent if either helper grows to emit more than one step.
    • Added a step-boundary realignment guard before insertion:
      • If computed index points into the middle of a serialized step block, advance to the next - name: boundary.
    • Added a warning log when the boundary-scan loop exhausts the steps slice, making miscalculations visible in production logs.
    • Result: token step insertion no longer fragments neighboring YAML blocks.
  • Regression coverage

    • Added focused regression test for safe-outputs.github-app + OTLP headers path (OTEL_EXPORTER_OTLP_HEADERS).
    • Added regression test for the GH_AW_OTLP_ATTRIBUTES-only path.
    • Added regression test for the combined path where both env vars are present (highest-risk composition — insertIndex incremented twice).
    • All three tests assert the setup-agent-output run block remains intact and is not relocated under a with: block.
for insertIndex < len(steps) && !strings.HasPrefix(steps[insertIndex], stepNameLinePrefix) {
	insertIndex++
}
if insertIndex == len(steps) {
	consolidatedSafeOutputsJobLog.Printf(
		"WARN: app-token insertion reached end of steps slice (len=%d); step ordering may be incorrect",
		len(steps),
	)
}

Copilot AI and others added 3 commits June 5, 2026 06:05
Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Copilot AI changed the title [WIP] Fix invalid YAML generation in safe-outputs job with OTLP headers Fix safe-outputs YAML corruption when OTLP header masking is enabled Jun 5, 2026
Copilot AI requested a review from pelikhan June 5, 2026 06:08
@pelikhan pelikhan marked this pull request as ready for review June 5, 2026 06:13
Copilot AI review requested due to automatic review settings June 5, 2026 06:13
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR fixes a workflow YAML corruption issue in the consolidated safe_outputs job when both OTLP header masking steps and GitHub App token minting are enabled. The corruption was caused by inserting the app-token step at a slice index that could fall inside another step’s multi-line block, producing invalid indentation (e.g., run: content ending up under a with: block).

Changes:

  • Adjusts the GitHub App token insertion index calculation to account for OTLP header/attribute masking step injection.
  • Adds a step-boundary realignment guard to ensure insertion occurs at the next canonical step start line.
  • Adds a regression test covering the GitHub App + OTLP headers composition and asserting the setup-agent-output-env run: | block remains intact.
Show a summary per file
File Description
pkg/workflow/compiler_safe_outputs_job.go Updates app-token insertion index logic to include OTLP mask steps and realigns insertion to step boundaries to prevent YAML block fragmentation.
pkg/workflow/compiler_safe_outputs_job_test.go Adds a focused regression test that reproduces the OTLP-headers masking + GitHub App insertion scenario and checks for the prior YAML corruption pattern.

Copilot's findings

Tip

Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

  • Files reviewed: 2/2 changed files
  • Comments generated: 0

@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented Jun 5, 2026

@copilot review all comments and address unresolved review feedback.

Generated by 👨‍🍳 PR Sous Chef · 78.8 AIC ·

@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented Jun 5, 2026

@copilot refresh the branch and rerun checks, then summarize any remaining blockers.

Generated by 👨‍🍳 PR Sous Chef · 78.8 AIC ·

@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented Jun 5, 2026

PR Code Quality Reviewer completed the code quality review.

@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented Jun 5, 2026

🧪 Test Quality Sentinel completed test quality analysis.

@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented Jun 5, 2026

Design Decision Gate 🏗️ completed the design decision gate check.

No ADR enforcement needed: PR #37068 does not have the 'implementation' label (has_implementation_label=false) and has 63 new lines of code in business logic directories, which is at or below the 100-line threshold (requires_adr_by_default_volume=false).

@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented Jun 5, 2026

🧠 Matt Pocock Skills Reviewer has completed the skills-based review. ✅

Copy link
Copy Markdown
Contributor

@github-actions github-actions Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Skills-Based Review 🧠

Applied /diagnose and /tdd — commenting with non-blocking suggestions to improve test coverage and make the index arithmetic more resilient.

📋 Key Themes & Highlights

Key Themes

  • Test coverage gaps: The regression test covers OTEL_EXPORTER_OTLP_HEADERS but not GH_AW_OTLP_ATTRIBUTES, and no test exercises both simultaneously.
  • Fragile offset arithmetic: insertIndex++ hardcodes a step-count assumption that the while-loop guard compensates for but does not remove.
  • Silent failure mode: The boundary-scan loop has no log or error when it exhausts the slice, making a miscalculation invisible in production.

Positive Highlights

  • Two-layer defence: explicit offset increments + a while-loop boundary guard is a well-structured approach — the guard provides a real safety net even when the arithmetic drifts.
  • Focused regression test: TestGitHubAppTokenStepWithOTLPHeaders targets the exact corruption signature (with:\n echo ...) rather than a generic golden-file comparison. Easy to understand and maintain.
  • stepNameLinePrefix as a named constant: elevating the sentinel to a package constant with a doc-comment is the right call; it makes the intent clear and is a single point of change.
  • Clear PR description: root cause, fix strategy, and regression coverage are all explained concisely. Great template for future compiler bug fixes.

🧠 Reviewed using Matt Pocock's skills by Matt Pocock Skills Reviewer · sonnet46 1.7M · 245.9 AIC

workflowData := &WorkflowData{
Name: "Test Workflow",
Env: `"env":
OTEL_EXPORTER_OTLP_HEADERS: "Authorization=******"`,
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[/tdd] The regression test only exercises the isOTLPHeadersPresent path (OTEL_EXPORTER_OTLP_HEADERS). The isOTLPAttributesPresent path (GH_AW_OTLP_ATTRIBUTES) has no coverage — but the fix also increments insertIndex for it.

💡 Suggested addition

Add a second test (or extend this one) with GH_AW_OTLP_ATTRIBUTES set in Env, repeating the same ordering assertions. Then add a third scenario with both env vars set to confirm insertIndex is incremented twice correctly.

Env: `"env":
  OTEL_EXPORTER_OTLP_HEADERS: "Authorization=******"
  GH_AW_OTLP_ATTRIBUTES: "{\"service.name\":\"test\"}"`

Without this, a regression in the attributes path would go undetected.

assert.Less(t, setupEnvEchoIndex, tokenIndex, "app token step must be inserted after setup-agent-output-env run block")
// Regression guard: app token insertion must not split setup-agent-output-env and leave
// an `echo ...` run line nested under a previous `with:` block.
assert.NotContains(t, stepsContent, "with:\n echo \"GH_AW_AGENT_OUTPUT=/tmp/gh-aw/agent_output.json\" >> \"$GITHUB_OUTPUT\"")
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[/tdd] No test exercises the combined scenario where both OTEL_EXPORTER_OTLP_HEADERS and GH_AW_OTLP_ATTRIBUTES are present at the same time. That is the highest-risk composition — insertIndex is incremented twice — yet it has no regression guard.

💡 Why this matters

If one of the two insertIndex++ increments were accidentally removed or a mask step were to emit more than one entry, the combined path would silently corrupt YAML in a way that the single-variable tests would not detect. A dedicated combined test would catch that class of regression immediately.

Comment on lines +455 to +460
if isOTLPHeadersPresent(data) {
insertIndex++
}
if isOTLPAttributesPresent(data) {
insertIndex++
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[/diagnose] The insertIndex++ increments are load-bearing: each assumes the corresponding mask helper appends exactly one string entry to the steps slice. The comment documents this ("Each mask helper currently appends one string entry"), but if generateOTLPHeadersMaskStep() or generateOTLPAttributesMaskStep() ever grow to emit two entries (e.g. an added env-echo step), the arithmetic silently undercounts and the while-loop guard must compensate.

💡 More robust alternative

The offset could be derived from the actual number of entries appended, keeping the arithmetic self-consistent regardless of how many lines each helper emits:

if isOTLPHeadersPresent(data) {
    mask := generateOTLPHeadersMaskStep()
    // len == 1 today, but this survives future growth
    insertIndex += strings.Count(mask, "      - name: ")
}

Or expose a step-count constant alongside each generator. Either approach removes the silent undercounting risk.

Comment on lines +488 to +490
for insertIndex < len(steps) && !strings.HasPrefix(steps[insertIndex], stepNameLinePrefix) {
insertIndex++
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[/diagnose] If the loop exhausts steps without finding a stepNameLinePrefix line, insertIndex equals len(steps) and the app-token block is silently appended at the very end — after checkout and safe-output steps that reference steps.safe-outputs-app-token.outputs.token. The YAML would be structurally valid but semantically broken, and there is no log message or error to surface it.

💡 Suggested guard

A short diagnostic log after the loop would make this case visible:

for insertIndex < len(steps) && !strings.HasPrefix(steps[insertIndex], stepNameLinePrefix) {
    insertIndex++
}
if insertIndex == len(steps) {
    consolidatedSafeOutputsJobLog.Printf(
        "WARN: app-token insertion reached end of steps slice (len=%d); step ordering may be incorrect",
        len(steps),
    )
}

This is a rare edge case today, but becomes more likely if future step sequences are ever added that push the final checkout step past the end of the current emission order.

@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented Jun 5, 2026

🧪 Test Quality Sentinel Report

Test Quality Score: 80/100 — Excellent

Analyzed 1 test: 1 design test (behavioral contract), 0 implementation tests, 1 minor guideline note (missing assertion message on assert.NotContains).

📊 Metrics & Test Classification (1 test analyzed)
Metric Value
New/modified tests analyzed 1
✅ Design tests (behavioral contracts) 1 (100%)
⚠️ Implementation tests (low value) 0 (0%)
Tests with error/edge cases 1 (100%)
Duplicate test clusters 0
Test inflation detected Yes — 44 test lines / 19 production lines ≈ 2.3:1 (attributable to necessary WorkflowData setup verbosity)
🚨 Coding-guideline violations 1 — missing assertion message on assert.NotContains (minor)

Test Classification Details

Test File Classification Issues Detected
TestGitHubAppTokenStepWithOTLPHeaders pkg/workflow/compiler_safe_outputs_job_test.go:816 ✅ Design Minor: 1 assertion missing a message argument

Language Support

Tests analyzed:

  • 🐹 Go (*_test.go): 1 test — unit (//go:build !integration)
  • 🟨 JavaScript (*.test.cjs, *.test.js): 0 tests
⚠️ Flagged Tests — Minor Issue (1)

⚠️ TestGitHubAppTokenStepWithOTLPHeaders — missing assertion message

Classification: ✅ Design test
Issue: The final assert.NotContains regression guard omits a trailing message argument, making test failure output less informative.

// Current — missing message:
assert.NotContains(t, stepsContent,
    "with:\n          echo \"GH_AW_AGENT_OUTPUT=/tmp/gh-aw/agent_output.json\" >> \"$GITHUB_OUTPUT\"")

// Suggested fix:
assert.NotContains(t, stepsContent,
    "with:\n          echo \"GH_AW_AGENT_OUTPUT=/tmp/gh-aw/agent_output.json\" >> \"$GITHUB_OUTPUT\"",
    "app token step must not split setup-agent-output-env into a nested echo under with:")

This is a minor style issue — the test itself is high-quality and correctly guards the YAML-corruption regression.

Verdict

Check passed. 0% of new tests are implementation tests (threshold: 30%). TestGitHubAppTokenStepWithOTLPHeaders is a well-structured regression guard for an observable compiler-output contract: it verifies that steps in the generated YAML remain in the correct order when OTLP header masking is active, and that the app-token step is not inserted mid-block.

📖 Understanding Test Classifications

Design Tests (High Value) verify what the system does:

  • Assert on observable outputs, return values, or state changes
  • Cover error paths and boundary conditions
  • Would catch a behavioral regression if deleted
  • Remain valid even after internal refactoring

Implementation Tests (Low Value) verify how the system does it:

  • Assert on internal function calls (mocking internals)
  • Only test the happy path with typical inputs
  • Break during legitimate refactoring even when behavior is correct
  • Give false assurance: they pass even when the system is wrong

Goal: Shift toward tests that describe the system's behavioral contract — the promises it makes to its users and collaborators.

References: §26998808540

🧪 Test quality analysis by Test Quality Sentinel · sonnet46 2.5M · 457.4 AIC ·

Copy link
Copy Markdown
Contributor

@github-actions github-actions Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Test Quality Sentinel: 80/100. Test quality is acceptable. 0% of new tests are implementation tests (threshold: 30%). The new test TestGitHubAppTokenStepWithOTLPHeaders is a well-structured regression guard for an observable compiler-output contract. Minor note: the final assert.NotContains call is missing a descriptive message argument (see comment above).

@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented Jun 5, 2026

@copilot review all comments and address unresolved review feedback.

Generated by 👨‍🍳 PR Sous Chef · 42.2 AIC ·

@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented Jun 5, 2026

Please refresh the branch, rerun checks, and summarize any remaining blockers.

Generated by 👨‍🍳 PR Sous Chef · 42.2 AIC ·

@pelikhan
Copy link
Copy Markdown
Collaborator

pelikhan commented Jun 5, 2026

@copilot apply copilot-revie skill

…ounting

Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Copy link
Copy Markdown
Contributor Author

Copilot AI commented Jun 5, 2026

@copilot apply copilot-revie skill

Applied the copilot-review skill and addressed all four open review threads. Changes in the latest commit:

  • Tests (PRRT_kwDOPc1QR86HV6as) — Added TestGitHubAppTokenStepWithOTLPAttributes covering the GH_AW_OTLP_ATTRIBUTES-only path.
  • Tests (PRRT_kwDOPc1QR86HV6az) — Added TestGitHubAppTokenStepWithOTLPHeadersAndAttributes for the combined OTEL_EXPORTER_OTLP_HEADERS + GH_AW_OTLP_ATTRIBUTES scenario (the path where insertIndex is incremented twice).
  • Robustness (PRRT_kwDOPc1QR86HV6a0) — Replaced the hardcoded insertIndex++ increments with strings.Count(generateOTLP*MaskStep(), stepNameLinePrefix) so the offset is derived from the actual helper output and stays self-consistent if either helper ever grows to emit more than one step.
  • Diagnostics (PRRT_kwDOPc1QR86HV6a1) — Added a consolidatedSafeOutputsJobLog.Printf("WARN: ...") after the boundary-scan loop to surface the exhaustion case.
  • Style — Added the missing descriptive message argument to assert.NotContains in TestGitHubAppTokenStepWithOTLPHeaders.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

observability.otlp with headers produces invalid YAML in safe-outputs job (v0.74.3, v0.77.5)

3 participants