Diagnostics ladder
When a Loom run fails, follow this sequence from the smallest pointer to the narrowest evidence. The goal is not "open the biggest log file" but "open the exact phase or step that failed."
Use this when
| Situation | Start with |
|---|---|
| You have a failed run and want the fastest path to evidence | The receipt path Loom printed |
| A teammate shared a run directory | .loom/.runtime/logs/<run_id>/ |
| You suspect provider or contract behavior | The receipt, then phase_report_path if present |
Bring one of these
- The receipt path Loom printed, for example receipt: /absolute/path/to/repo/.loom/.runtime/receipts/loom-run-local-....json
- The runtime logs directory, .loom/.runtime/logs/<run_id>/
If you have the receipt, you also have the optional phase_report_path pointer for runtime-contract validation.
Decision path
Quick symptom map
| Symptom | Open this file first |
|---|---|
| The whole run failed and you need the first failing job | pipeline/manifest.json |
| You already know the failing job | jobs/<job_id>/manifest.json |
| The job failed before user script execution | jobs/<job_id>/system/<section>/events.jsonl via the job manifest |
| The job failed in a user step | jobs/<job_id>/user/execution/script/<NN>/events.jsonl via the job manifest |
Step 1 — Receipt: confirm the run and the pointers
Open: the receipt JSON Loom printed.
Inspect these fields:
| Field | Why it matters |
|---|---|
| status, exit_code | Tells you whether the run failed |
| logs_dir | Root pointer for the ladder below |
| phase_report_path | Optional pointer for phase validation and coverage |
Go next: open pipeline/summary.json or pipeline/manifest.json under logs_dir.
If the run succeeded but the outcome is still wrong, the receipt still gives you the correct run root and phase report.
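The field check in this step can be scripted. The following is a minimal Python sketch and not part of Loom: the helper names `read_receipt_pointers` and `load_receipt` are hypothetical, and the code assumes the receipt is a flat JSON object whose keys match the field names in the table above.

```python
import json
from pathlib import Path
from typing import Any, Dict


def read_receipt_pointers(receipt: Dict[str, Any]) -> Dict[str, Any]:
    """Return only the receipt fields this step inspects.

    Assumption: the fields are top-level keys. Missing keys come back as
    None, which matters for phase_report_path because it is optional.
    """
    wanted = ("status", "exit_code", "logs_dir", "phase_report_path")
    return {name: receipt.get(name) for name in wanted}


def load_receipt(path: str) -> Dict[str, Any]:
    """Read a receipt JSON file from disk and extract the pointers."""
    with Path(path).open(encoding="utf-8") as handle:
        return read_receipt_pointers(json.load(handle))
```

Calling `load_receipt` with the path Loom printed yields the `logs_dir` value needed for the next step, plus `phase_report_path` when the receipt carries one.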
Step 2 — Pipeline summary: did the pipeline fail?
Open: pipeline/summary.json
{
  "schema_version": "loom.runtime.logs.v2",
  "run_id": "loom-run-local-1772865600000000000",
  "pipeline_id": "loom-local-1772865600000000000",
  "status": "failure",
  "exit_code": 1,
  "duration_ms": 155557,
  "error": "job \"check-pnpm\" failed"
}
| Field | What it tells you |
|---|---|
| status | success or failure |
| exit_code | Pipeline exit code |
| error | Top-level pipeline error when Loom has one |
Go next: if the pipeline failed, open pipeline/manifest.json.
If status is success, stop using the failure ladder and switch to output validation or artifact inspection instead.
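The decision in this step is small enough to express as one function. This is a hedged sketch rather than Loom tooling: `next_file_after_summary` is a hypothetical name, and the code treats any status other than "success" as a reason to continue down the ladder, which is an assumption beyond the two values listed in the table.

```python
from typing import Any, Dict, Optional


def next_file_after_summary(summary: Dict[str, Any]) -> Optional[str]:
    """Return the next file to open under logs_dir, or None to leave the ladder.

    Assumption: a missing or unexpected status is handled like a failure so
    that a malformed summary still leads to the pipeline manifest.
    """
    if summary.get("status") == "success":
        return None  # switch to output validation or artifact inspection
    return "pipeline/manifest.json"
```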
Step 3 — Pipeline manifest: which job should you inspect first?
Open: pipeline/manifest.json
{
  "schema_version": "loom.runtime.logs.v2",
  "status": "failure",
  "exit_code": 1,
  "failing_job_id": "check-pnpm",
  "failing_job_manifest_path": "jobs/check-pnpm/manifest.json",
  "jobs": [
    {
      "job_id": "check-pnpm",
      "status": "failed",
      "job_manifest_path": "jobs/check-pnpm/manifest.json",
      "job_summary_path": "jobs/check-pnpm/summary.json",
      "system_events_path": "jobs/check-pnpm/system/provider/events.jsonl",
      "artifacts_path": "jobs/check-pnpm/artifacts"
    }
  ]
}
| Field | What it tells you |
|---|---|
| failing_job_id | The first failing job |
| failing_job_manifest_path | The next file to open |
| jobs[] | The full job roster, including artifact pointers |
Go next: open failing_job_manifest_path.
Start with the first failing job unless you already know the issue is global.
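Following the pointer can be sketched in a few lines of Python. The helper name `failing_job_manifest` is hypothetical, and the code assumes that relative paths in the pipeline manifest resolve against `logs_dir`; the example above shows relative paths but does not state their base directory.

```python
from pathlib import Path
from typing import Any, Dict, Optional


def failing_job_manifest(logs_dir: str, pipeline_manifest: Dict[str, Any]) -> Optional[Path]:
    """Resolve failing_job_manifest_path to a path on disk.

    Assumption: the pointer is relative to logs_dir. Returns None when the
    manifest names no failing job, for example on a successful run.
    """
    relative = pipeline_manifest.get("failing_job_manifest_path")
    if not relative:
        return None
    return Path(logs_dir) / relative
```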
Step 4 — Job manifest: user step or system section?
Open: jobs/<job_id>/manifest.json
This is the main branching point.
User-step failure
If failing_step_events_path is present, your failure is in user execution:
{
  "job_id": "check-pnpm",
  "status": "failed",
  "failing_section": "script",
  "failing_step_index": 2,
  "failing_step_events_path": "jobs/check-pnpm/user/execution/script/02/events.jsonl",
  "user_steps": [
    {
      "step_id": "script-02",
      "command_preview": "pnpm install --frozen-lockfile",
      "step_events_path": "jobs/check-pnpm/user/execution/script/02/events.jsonl"
    }
  ]
}
Go next: open the pointed step events file.
System-section failure
If failing_step_events_path is absent, the job failed in provider, cache, artifact, or other system work:
{
  "job_id": "build-image",
  "status": "failed",
  "failing_section": "provider",
  "system_sections": [
    {
      "system_section": "provider",
      "phase_code": "job.provider_prepare",
      "events_path": "jobs/build-image/system/provider/events.jsonl"
    },
    {
      "system_section": "cache_restore",
      "phase_code": "job.cache_restore",
      "events_path": "jobs/build-image/system/cache_restore/events.jsonl"
    }
  ]
}
Use failing_section to choose the matching events_path.
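The branch described in this step maps onto a short lookup. The sketch below is illustrative only: `pick_events_path` is a hypothetical helper, and it assumes the job manifest has the shape shown in the two examples above.

```python
from typing import Any, Dict, Optional, Tuple


def pick_events_path(job_manifest: Dict[str, Any]) -> Tuple[str, Optional[str]]:
    """Return (kind, events_path) for the failing unit of a job.

    kind is "user" when failing_step_events_path is present, "system" when a
    system section matches failing_section, and "unknown" otherwise.
    """
    step_path = job_manifest.get("failing_step_events_path")
    if step_path:
        return "user", step_path

    failing_section = job_manifest.get("failing_section")
    for section in job_manifest.get("system_sections", []):
        if section.get("system_section") == failing_section:
            return "system", section.get("events_path")

    return "unknown", None
```

Returning the kind alongside the path keeps the distinction between a user step and a system section visible to the caller.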
Step 5 — Read the pointed events.jsonl
Open: the exact events file from Step 4.
The current runtime contract uses phase_start, output, and phase_finish records:
{"schema_version":"loom.runtime.logs.v2","ts":"2026-03-07T12:34:56Z","seq":14,"level":"info","event":"output","scope":"step","phase_code":"execution.script","phase_family":"user","stream":"stderr","message":"ERR! Missing lockfile entry for @docusaurus/core"}
{"schema_version":"loom.runtime.logs.v2","ts":"2026-03-07T12:34:56Z","seq":15,"level":"error","event":"phase_finish","scope":"step","phase_code":"execution.script","phase_family":"user","status":"failed","exit_code":1,"duration_ms":942}
| Field | What to look for |
|---|---|
| phase_code | Which phase failed |
| message | The actual stderr or stdout line on output events |
| status, exit_code | The closing outcome on phase_finish |
| metrics | Skip and telemetry detail for cache or artifact sections |
Go next: only widen to summaries or the phase report if this pointed file still does not explain the problem.
Look for the final phase_finish first, then read the preceding output events for the error text.
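The reading order above (final phase_finish first, then the output lines before it) can be sketched as follows. `summarize_events` and its `context` parameter are hypothetical names, and the code assumes one JSON object per line with the `event` and `message` fields shown in the sample records.

```python
import json
from typing import Any, Dict, Iterable, List, Optional, Tuple


def summarize_events(
    lines: Iterable[str], context: int = 5
) -> Tuple[Optional[Dict[str, Any]], List[str]]:
    """Return the last phase_finish record and the output messages before it.

    Only the final `context` messages are kept, since the error text usually
    sits just ahead of the closing record.
    """
    events = [json.loads(line) for line in lines if line.strip()]

    # Walk backwards to find the final phase_finish record.
    finish_index = None
    for index in range(len(events) - 1, -1, -1):
        if events[index].get("event") == "phase_finish":
            finish_index = index
            break
    if finish_index is None:
        return None, []

    # Collect output messages that precede the closing record.
    messages = [
        event["message"]
        for event in events[:finish_index]
        if event.get("event") == "output" and "message" in event
    ]
    return events[finish_index], (messages[-context:] if context > 0 else [])
```

An open file handle works as the `lines` argument, for example `with open(events_path, encoding="utf-8") as handle: summarize_events(handle)`.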
Optional Step 6 — Phase report: validate the timeline
Open: phase-report.json or follow phase_report_path from the receipt.
Use it when the main failure ladder is not enough:
| Question | Why the phase report helps |
|---|---|
| "Did Loom emit all required phase boundaries?" | validation and plan answer that directly |
| "Did cache or artifact phases run?" | phase_metrics and plan.requirements show them |
| "Was work ordered correctly?" | Validation issues include ordering failures |
| "How much runtime was attributed?" | coverage reports attributed vs unattributed runtime |
When to widen scope
Only widen scope after the pointed events file fails to explain the problem.
| Situation | Next move |
|---|---|
| The failing unit still does not explain the error | Read jobs/<job_id>/summary.json and the matching system section summary |
| Multiple jobs failed | Revisit pipeline/manifest.json and inspect the next failed job |
| You suspect a contract bug rather than a task failure | Open phase-report.json |
| You need the exact workspace or revision context | Go back to the receipt |
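For the case where multiple jobs failed, the job roster in pipeline/manifest.json can be filtered to list each failed job's manifest in roster order. This is a hedged sketch: `failed_job_manifests` is a hypothetical helper, and it assumes failed jobs carry the status value "failed", as in the Step 3 example.

```python
from typing import Any, Dict, List


def failed_job_manifests(pipeline_manifest: Dict[str, Any]) -> List[str]:
    """Return job_manifest_path for every job whose status is "failed".

    Assumption: roster order is a reasonable inspection order; the source
    only guarantees that failing_job_id names the first failing job.
    """
    return [
        job["job_manifest_path"]
        for job in pipeline_manifest.get("jobs", [])
        if job.get("status") == "failed" and "job_manifest_path" in job
    ]
```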