AI SDLC Kit
Guide

Operations Phase

How to close an epic, sync context memory, and prepare for the next iteration.

What is the Operations phase?

The Operations phase runs after the human approves the review of an epic. It closes the delivery cycle with three responsibilities:

  1. Deploy preparation: what must happen before and after shipping
  2. Observability definition: how to know the epic is healthy in production
  3. Context sync: updating global memory for future epics

/epic-close <N>

/epic-close

The 🚀 Ops agent asks for the epic number, then reads:

  • doc-specs/<N>-epic/spec-epic-N.md
  • doc-specs/<N>-epic/PRD.md
  • doc-specs/<N>-epic/decisions-log.md
  • doc-specs/CONTEXT.md
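
The four inputs follow a predictable layout, so it can help to check that they exist before running the command. The sketch below is only an illustration: the helper names `epic_inputs` and `missing_inputs` are hypothetical and not part of the kit, and it assumes `<N>` is the plain epic number.

```python
from pathlib import Path


def epic_inputs(epic: int, root: Path = Path("doc-specs")) -> list[Path]:
    """Return the four files listed above for epic number `epic`."""
    epic_dir = root / f"{epic}-epic"
    return [
        epic_dir / f"spec-epic-{epic}.md",
        epic_dir / "PRD.md",
        epic_dir / "decisions-log.md",
        root / "CONTEXT.md",
    ]


def missing_inputs(epic: int, root: Path = Path("doc-specs")) -> list[Path]:
    """Return the inputs that do not exist on disk yet."""
    return [path for path in epic_inputs(epic, root) if not path.is_file()]
```

An empty list from `missing_inputs(3)` would suggest epic 3 has everything the Ops agent expects to read.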

It generates doc-specs/<N>-epic/ops-epic-N.md with:

| Section | Contents |
| --- | --- |
| Deploy preparation | Required env vars, infra dependencies, migration sequence, feature flags, rollback plan |
| Breaking changes | Any changes that affect other services or future epics |
| Observability | Critical logs to monitor, health metrics, alerts to configure |
| Production validation | How to confirm this epic works correctly in production |
| Technical debt | Debt generated by this epic that must be addressed in future epics |
| Anomaly patterns | What distinguishes normal from abnormal behavior for the features delivered |
| Feedback for future epics | Risks, learnings, and suggested adjustments to epics.md |
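
A reviewer may want a quick completeness check on the generated file before reading it in depth. The following is a minimal sketch, assuming each section appears in ops-epic-N.md as a Markdown heading with exactly the names above; the real file layout may differ, and `missing_sections` is a hypothetical helper, not a kit command.

```python
OPS_SECTIONS = [
    "Deploy preparation",
    "Breaking changes",
    "Observability",
    "Production validation",
    "Technical debt",
    "Anomaly patterns",
    "Feedback for future epics",
]


def missing_sections(ops_text: str) -> list[str]:
    """Return expected section names that have no matching heading line."""
    # Assumption: sections are Markdown headings such as "## Observability".
    headings = {
        line.lstrip("#").strip().lower()
        for line in ops_text.splitlines()
        if line.startswith("#")
    }
    return [name for name in OPS_SECTIONS if name.lower() not in headings]
```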

✅ HITL (human in the loop): Review ops-epic-N.md before deploying.

  • Is the deploy sequence safe?
  • Are all breaking changes documented?
  • Are anomaly patterns precise enough for on-call monitoring?

Production gate

Once ops-epic-N.md has been reviewed, the production gate is manual, with no prompt or agent involved:

  1. Merge the epic branch (feat/E<NN>-<slug>) to main
  2. Execute the deploy
  3. Validate in production using the criteria defined in ops-epic-N.md
  4. Only after production validation can the next epic begin
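
Because the gate is manual, a team might track it as a simple checklist. The sketch below is illustrative only: `epic_branch` and `ProductionGate` are hypothetical names, and it assumes `<NN>` in the branch name is the epic number zero-padded to two digits.

```python
from dataclasses import dataclass


def epic_branch(epic: int, slug: str) -> str:
    """Build the branch name feat/E<NN>-<slug>.

    Assumption: <NN> is the epic number zero-padded to two digits.
    """
    return f"feat/E{epic:02d}-{slug}"


@dataclass
class ProductionGate:
    merged: bool = False     # step 1: epic branch merged to main
    deployed: bool = False   # step 2: deploy executed
    validated: bool = False  # step 3: validated against ops-epic-N.md

    def next_epic_allowed(self) -> bool:
        """Step 4: the next epic may begin only after all three steps."""
        return self.merged and self.deployed and self.validated
```

With this model, a gate that is merged and deployed but not yet validated still blocks the next epic.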

/context-sync <N>

/context-sync

After the production gate passes, the 🏗️ Architect agent reads ops-epic-N.md and decisions-log.md and updates doc-specs/CONTEXT.md with:

  • Summary of the completed epic
  • All ADRs (architectural decision records) from decisions-log.md
  • Technical debt registered by the Ops agent
  • Risks and learnings that affect future epics
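
To make the four items above concrete, here is one possible shape for the entry appended to CONTEXT.md. The heading names and bullet layout are assumptions made for illustration; the Architect agent's actual output format is not specified here, and `context_entry` is a hypothetical helper.

```python
def context_entry(epic: int, summary: str, adrs: list[str],
                  debt: list[str], learnings: list[str]) -> str:
    """Render one CONTEXT.md entry covering the four items listed above."""

    def bullets(items: list[str]) -> str:
        # An empty list is stated explicitly rather than silently omitted.
        return "\n".join(f"- {item}" for item in items) or "- (none)"

    return (
        f"## Epic {epic}\n\n"
        f"### Summary\n{summary}\n\n"
        f"### ADRs\n{bullets(adrs)}\n\n"
        f"### Technical debt\n{bullets(debt)}\n\n"
        f"### Risks and learnings\n{bullets(learnings)}\n"
    )
```

Writing "(none)" for an empty category keeps the entry unambiguous for later readers: the category was considered, not forgotten.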

CONTEXT.md is the project's long-term memory. Every agent reads it before acting on any epic. Keeping it accurate is what ensures each epic builds on a shared, validated understanding of what was built before.


Incident triage (ongoing)

At any point after an epic ships, if an incident is observed in production:

/ops-triage

The 🚀 Ops agent:

  1. Asks for the symptom observed and the potentially affected epic
  2. Reads ops-epic-N.md and maps the symptom to documented anomaly patterns
  3. Proposes graduated actions: contain → mitigate → fix → prevent
  4. Records the triage in doc-specs/<N>-epic/incident-log.md
  5. If the incident reveals a gap in the playbook or requires a permanent fix, flags it as technical debt in epics.md
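
Steps 2 to 4 can be sketched as a keyword match followed by a log record. This is a deliberately simplified illustration: the agent's real matching is not described here, the keyword dictionary and record layout are assumptions, and all function names are hypothetical.

```python
from datetime import date

# Graduated actions, in the order given in step 3.
ACTIONS = ("contain", "mitigate", "fix", "prevent")


def incident_log_path(epic: int) -> str:
    """Path of the incident log for epic N (step 4)."""
    return f"doc-specs/{epic}-epic/incident-log.md"


def match_patterns(symptom: str, patterns: dict[str, list[str]]) -> list[str]:
    """Return anomaly pattern names with a keyword found in the symptom."""
    text = symptom.lower()
    return [
        name for name, keywords in patterns.items()
        if any(keyword.lower() in text for keyword in keywords)
    ]


def incident_record(symptom: str, matched: list[str], day: date) -> str:
    """Render a log entry with one placeholder line per graduated action."""
    lines = [
        f"## {day.isoformat()}: {symptom}",
        f"Matched patterns: {', '.join(matched) or 'none documented'}",
    ]
    lines += [f"- {action}: TODO" for action in ACTIONS]
    return "\n".join(lines) + "\n"
```

A symptom that matches no documented pattern is itself a signal: it points to the playbook gap described in step 5.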

The Ops agent does not execute actions in production; it proposes, documents, and closes the learning loop.


The loop for the next epic

/context-sync <N> completes
       ↓
  CONTEXT.md updated
       ↓
  /epic-init <N+1>
       ↓
  Spec phase → Epic phase → Operations phase
       ↓
  Repeat until epics.md is complete

Each time the loop repeats, the Architect agent reads the updated CONTEXT.md, ensuring every subsequent epic is informed by the full history of decisions, debt, and learnings from all previous epics.