You're back in the car with a driver from lesson 1.2 — that's the agent. This lesson is the sat-nav running in the dashboard.
You type a destination: "the airport." The sat-nav doesn't drive the whole trip blind. It runs a tight loop:
- Plan — it lays out a route. Here are the turns, ETA 25 minutes.
- Act — you drive the next leg. One segment, not the whole journey at once.
- Evaluate — it checks reality against the plan. On the expected road? Traffic? A closure?
If reality matches the plan, keep going. If it doesn't (an accident, a closed road), the sat-nav reroutes: back to plan, drive, check, again. The loop ends only when you've arrived (success) or it says "can't get there, take over" (escalate to a human).
That's an agent on a task: Plan → Act → Evaluate, looping until success or hand-off. The crucial bit — the sat-nav doesn't decide you've arrived because it feels confident; it checks the actual GPS signal. An agent must judge by real signals (tests, scans, reviews), not its own confidence.
The loop, in plain English
An agent doesn't make one decision and stop. It cycles — three phases, repeated:
1. Plan: the agent interprets the goal and works out the steps. In a good system the plan is not a hidden internal thought; it's a structured, reviewable artifact (a PR description, an issue, a checklist) that a human can read. A strong plan states three things:
   - Scope: what will change.
   - Success criteria: how you'll know it worked.
   - Rollback or escalation path: what happens if it goes wrong. Rollback means undoing the change to put things back the way they were; escalation means handing the decision to a human.
2. Act: the agent does the work in the repository. It creates a branch, commits, opens or updates a PR, and responds to review. This is deliberately bounded: everything happens on a branch and through the PR workflow, never as a direct push to the default branch. The branch and PR are the guardrails (lesson 1.3).
3. Evaluate: the agent and its human supervisors judge the result using signals from GitHub, namely workflow runs and status checks (build/test/lint), code review feedback, and security signals (code scanning, secret scanning, dependency alerts). Code scanning findings are reported in SARIF, a standard file format that security scanners use to report findings.
Evaluation must be grounded in system signals, not the agent's confidence. "It looks done to me" is not evaluation. "Required checks pass and the vulnerability scan is clean" is.
And the phase people forget: evaluation isn't the end. If checks fail or requirements aren't met, the loop continues — revise the plan, adjust the action, re-evaluate — until the outcome is acceptable or it's handed to a human.
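The cycle described above can be sketched in a few lines of Python. This is a minimal illustration, not a real agent framework: `Signals`, `run_loop`, and the `plan`, `act`, and `evaluate` callables are hypothetical names, and the default budget of three iterations is an assumption.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Signals:
    """Objective evidence gathered after an action (hypothetical shape)."""
    checks_passed: bool    # required status checks (build/test/lint)
    review_approved: bool  # code review outcome
    scan_clean: bool       # security scanning result

    def acceptable(self) -> bool:
        # Success is decided by evidence only, never by the agent's confidence.
        return self.checks_passed and self.review_approved and self.scan_clean


def run_loop(goal: str,
             plan: Callable[[str, list[str]], str],
             act: Callable[[str], None],
             evaluate: Callable[[], Signals],
             max_iterations: int = 3) -> str:
    """Cycle plan -> act -> evaluate until the signals are acceptable
    or the iteration budget is spent, then hand off to a human."""
    history: list[str] = []
    for attempt in range(1, max_iterations + 1):
        current_plan = plan(goal, history)  # revise using earlier failures
        act(current_plan)                   # bounded work on a branch / PR
        signals = evaluate()                # read real signals
        if signals.acceptable():
            return "success"
        history.append(f"attempt {attempt}: {signals}")
    return "escalate"


# Stubbed run: the first evaluation fails a check, the second one passes.
results = iter([Signals(False, True, True), Signals(True, True, True)])
outcome = run_loop("fix flaky test",
                   plan=lambda goal, history: f"{goal} (revision {len(history)})",
                   act=lambda current_plan: None,
                   evaluate=lambda: next(results))
print(outcome)  # success
```

Note that the loop has exactly two exits, success or escalation, and that the failure history is fed back into planning so each retry is a revision rather than a repeat.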
| Phase | What happens | GitHub artifact / signal |
|---|---|---|
| Plan | interpret goal → scope, success criteria, rollback | Issue, PR description, Agents tab |
| Act | branch, commits, open/update PR, revise | Branch, commits, pull request |
| Evaluate | judge by objective signals; loop or escalate | Workflow runs, checks, reviews, security scans |
When evaluation is made mandatory — rulesets / branch protection requiring checks to pass before merge — the loop's last step becomes an enforceable gate, not a polite suggestion.
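Conceptually, such a gate reduces to one rule: every required check must be present and must have concluded successfully. The sketch below illustrates that rule only; it is not GitHub's API. The function name, the check names, and the name-to-conclusion mapping are assumptions made for the example.

```python
def merge_allowed(required: set[str],
                  conclusions: dict[str, str]) -> tuple[bool, list[str]]:
    """Return (allowed, blocking_checks).

    A required check blocks the merge if it failed OR never reported:
    a missing signal is treated as "not passed", not as "fine".
    """
    blocking = sorted(name for name in required
                      if conclusions.get(name) != "success")
    return (not blocking, blocking)


required_checks = {"build", "test", "code-scanning"}  # hypothetical names
reported = {"build": "success", "test": "failure"}    # code-scanning never ran

print(merge_allowed(required_checks, reported))
# (False, ['code-scanning', 'test'])
```

Treating an absent result as blocking is the design choice that makes the gate enforceable: silence is not evidence.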
Plan vs execution — and when a human validates
A reliable system keeps three things separate: planning (intent), execution (state change), validation (evidence). The cleanest way to say it: "Planning is reviewable intent. Execution changes state." Separation exists so a human can review intent before accepting impact.
That raises one design question. It is not "is the work reviewed?" (it always is) but "when does the human's check happen relative to the code?"
| | Option A · Plan-first PR | Option B · Plan + execution (one PR) |
|---|---|---|
| What | Plan approved before any code | Plan (in description) + code (in commits) together |
| Human validates… | before code exists | before merge (code already written) |
| Best for | High-risk — workflows, infra, auth, production; hard to reverse | Low/medium-risk — speed matters, easily reversible |
| Same GitHub controls? | Yes — checks, CODEOWNERS, branch protection | Yes |
Option A = the builder shows you blueprints and waits for sign-off before lifting a hammer. Option B = they start framing while handing you the sketch — faster, fine for a garden shed, reckless for the foundation. Plan-first for large refactors, security, deployment/workflow, cross-repo, or multi-agent work.
Option B's only extra risk is at the proposal stage — GitHub's merge gates still stop unsafe code from shipping.
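One way to make the choice between the two options repeatable is to derive it from what the change touches. The following is a minimal sketch under stated assumptions: the function name, the path prefixes, and the `cross_repo` flag are illustrative, and a real team would define its own list of high-risk areas.

```python
# Hypothetical prefixes standing in for "workflows, infra, auth".
HIGH_RISK_PREFIXES = (".github/workflows/", "infra/", "auth/")


def review_mode(changed_paths: list[str], cross_repo: bool = False) -> str:
    """Pick when the human validates: before code exists or before merge."""
    high_risk = cross_repo or any(
        path.startswith(HIGH_RISK_PREFIXES) for path in changed_paths
    )
    return "plan-first" if high_risk else "plan+execution"


print(review_mode(["src/retry.py", "tests/test_retry.py"]))  # plan+execution
print(review_mode(["src/app.py", ".github/workflows/ci.yml"]))  # plan-first
```

Either result leaves the same merge controls in place; the function only decides how early the human sees the intent.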
The task contract: inputs, outputs, success criteria
Before an agent ever runs, define the task as a contract:
- Inputs (what it needs): the issue/alert, scope boundaries ("changes allowed under `src/` and dependency files, not `infra/` unless asked"), and hard constraints ("no workflow changes without platform review; no secrets; no direct-to-`main` pushes").
- Outputs (what it produces): a plan, a bounded PR, and evidence links (workflow runs).
- Success criteria (how it's judged): required checks pass, the real problem is resolved, scope matches intent, rollback path recorded for risky changes.
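The contract can be written down as data, which also makes part of it checkable by a machine. Here is a minimal sketch: `TaskContract`, its field names, and the example values (including `requirements.txt` as the dependency file) are assumptions made for illustration, not an official schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TaskContract:
    # Inputs
    source: str                         # the issue or alert that triggered the task
    allowed_paths: tuple[str, ...]      # scope boundary: where changes may land
    forbidden_paths: tuple[str, ...]    # off-limits unless explicitly asked
    constraints: tuple[str, ...]        # hard rules, in plain language
    # Outputs
    expected_outputs: tuple[str, ...]
    # Success criteria
    success_criteria: tuple[str, ...]

    def out_of_scope(self, changed_paths: list[str]) -> list[str]:
        """Return the changed paths that violate the scope boundary."""
        return [
            path for path in changed_paths
            if path.startswith(self.forbidden_paths)
            or not path.startswith(self.allowed_paths)
        ]


contract = TaskContract(
    source="dependency alert (placeholder)",
    allowed_paths=("src/", "requirements.txt"),
    forbidden_paths=("infra/",),
    constraints=("no workflow changes without platform review",
                 "no secrets",
                 "no direct pushes to main"),
    expected_outputs=("plan", "bounded PR", "evidence links"),
    success_criteria=("required checks pass", "vulnerability resolved"),
)

print(contract.out_of_scope(["src/app.py", "infra/main.tf", "docs/notes.md"]))
# ['infra/main.tf', 'docs/notes.md']
```

Only the scope boundary is enforced here; the plain-language constraints and success criteria still need checks or human review behind them.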
Make success reflect the real intent — "vulnerability resolved", not just "tests passed". And define success criteria before you give the agent tools — otherwise it can't know when to stop looping. A workflow can turn a success criterion into a required status check, so the PR can't merge until it passes — success enforced by the system, not assumed by the agent.
Designing for failure (reliability)
Agents will fail — misread tasks, break tests, conflict with existing behaviour. A reliable architecture assumes failure and builds recovery in. Four mechanisms:
- Bounded retries: the agent updates the branch and reruns checks when they fail.
- Escalation: a clear rule, for example: if the same required check fails twice, escalate to a human with a structured summary (what failed, what was tried, what evidence exists, suggested next step).
- Rollback readiness: high-risk changes carry rollback notes and scope limits.
- Least privilege: grant only the minimum access the agent needs, which shrinks the blast radius (how much could break if it goes wrong). See lesson 0.11.
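The escalation rule in particular is easy to make mechanical. The sketch below assumes each failed run is recorded as a `(check_name, attempted_fix, evidence)` tuple; the `Escalation` type, the `next_step` function, and the wording of the suggested next step are hypothetical.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Optional


@dataclass
class Escalation:
    """Structured summary handed to a human."""
    what_failed: str
    what_was_tried: list[str]
    evidence: list[str]
    suggested_next_step: str


def next_step(failures: list[tuple[str, str, str]],
              threshold: int = 2) -> Optional[Escalation]:
    """failures holds (check_name, attempted_fix, evidence), oldest first.

    Returns an Escalation once the same check has failed `threshold`
    times; returns None while a bounded retry is still allowed.
    """
    counts = Counter(name for name, _, _ in failures)
    for name, count in counts.items():
        if count >= threshold:
            related = [entry for entry in failures if entry[0] == name]
            return Escalation(
                what_failed=name,
                what_was_tried=[fix for _, fix, _ in related],
                evidence=[ref for _, _, ref in related],
                suggested_next_step=f"human review of the '{name}' failures",
            )
    return None


log = [("test", "pin dependency", "workflow run A"),
       ("lint", "reformat file", "workflow run B"),
       ("test", "revert pin", "workflow run C")]

summary = next_step(log)
print(summary.what_failed, summary.what_was_tried)
# test ['pin dependency', 'revert pin']
```

Because the summary is built from the recorded failures, the human receives the evidence along with the hand-off instead of having to reconstruct it.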
The cert-language version
Agentic systems run a plan → act → evaluate lifecycle — a loop, not a single pass — iterating until success criteria are met or the work is escalated. The plan is a structured, reviewable artifact (scope, success criteria, rollback); action is bounded to branches and PRs; evaluation is grounded in system signals (checks, reviews, security scans), not the agent's confidence. Keep planning, execution, and validation separate; choose plan-first for high-risk work. Define the task as inputs / outputs / success criteria, and design for failure with retries, escalation, rollback, and least privilege.
Our summary · grounded in MS Learn — Foundations of Agentic AI (unit 3) + Designing Agent Architecture & SDLC Integration (units 3, 4, 7) · fetched 2026-05-31
Common confusions (read these or lose points)
- "Evaluate = the agent says it's done." No — evaluation is grounded in system signals (checks, scans, reviews), never the agent's confidence.
- "A plan proves the work is safe." No. A plan is not validation — an agent-generated plan doesn't prove the implementation is safe. Validation is a separate, evidence-based step.
- "Plan-first vs plan+execution = reviewed vs not reviewed." No — both are always reviewed. The only difference is when the human validates relative to the code.
- "CI green = done." Necessary, not sufficient. Success must match the real intent.
- "Evaluation evidence can live in the agent's private logs." No — it belongs in GitHub-native artifacts: workflow runs, checks, uploaded artifacts.
- (Study guide suggests) a well-formed task is bounded, testable, API-compatible, reviewable, with required validation. For example: "improve the payment service" (weak) → "update retry logic to retry transient failures 3× with backoff, add tests, don't change the public API, open a draft PR with validation output" (strong). Treat the `/plan` command and exact task-table wording as study-guide framing, not official text.