Same build site. Today three robot models show up to the crew, and they don't all work the same way:
- An off-site workshop unit — you mail it the job ticket, it builds the part back at the factory, then ships you a finished crate to inspect before you bolt it on. You never watch it work; you review the result. (The cloud agent — assign it an issue, it works on its own branch, hands you a PR.)
- A handheld guided tool in your own hands on-site — fast, does exactly what you point it at, right here on your machine. (The CLI agent.)
- A desktop unit at the workbench in the IDE — you drive it in your editor, watch every move. (The app/IDE agent; brands include Copilot, Codex, Claude.)
Same crew, different surfaces. But here's the part that matters for the exam: how much leash you give any of them is not a property of the robot — it's a dial you set per job, based on how much damage a mistake could do. Sweeping the floor? Let it run. Rewiring the main electrical panel? It prepares the work and waits for a human to throw the switch.
That dial is autonomy, and the craft of agent architecture is turning it down where the blast radius (how much could break if it goes wrong) is big — and proving, with real GitHub controls, that the dial can't be turned past where you set it.
Part 1 · The agent variants (surfaces)
An agent isn't one thing. The same "explore → plan → act → evaluate" loop from lesson 1.4 runs on different surfaces, and the surface changes how you interact with it.
| Variant | Where it runs | How you start it | What you get back |
|---|---|---|---|
| Cloud agent (Copilot Cloud Agent) | GitHub's servers | Assign a GitHub issue, or use the Agents tab to generate a plan | A branch + a pull request to review |
| CLI agent | Your local machine / a runner | Run the CLI; pass flags | Edits and actions on your machine, live |
| IDE / app agent | Your editor | Drive it in-editor | Suggestions and edits you watch in real time |
Key facts about the cloud agent (exam-relevant):
- It explores the repo, suggests a plan, edits a branch, and opens a PR — the human reviews the PR. This is the contributor model from lesson 1.3, made concrete.
- Two ways to kick it off: assign an issue to Copilot, or use the Agents tab.
- Not every tool is available on every surface. The `web` tool (fetch URLs / web search) and the `todo` tool (task list) are not supported in the cloud agent today, so "the cloud agent can browse the live web" is a wrong-answer trap.
CLI knobs worth knowing (official-confirmed):
- `--autopilot`: autopilot mode lets the agent continue working autonomously on the local machine until the task is complete, without stopping to ask.
- `--agent=NAME`: select a specific custom agent by name (e.g. `copilot --agent=refactor-agent`).
The gist also lists /delegate (or & prompt) to push a task to the cloud agent as a background job, /fleet to split work into parallel subagents, and --no-ask-user to suppress prompts. The mental model — local-continue vs hand-to-cloud vs fan-out — is useful; those exact flag names beyond --autopilot/--agent aren't in the official pool, so don't bet the exam on them.
Custom agents live in files. An org- or enterprise-scoped custom agent is defined under /agents/ inside a .github-private repository. (The file format — role, model, tools, persona — is lesson 1.7.)
Part 2a · Autonomy by capability (what tools it holds)
This is the heart of the lesson, and a named exam objective — set how much autonomy an agent gets for a task, and build the guardrails to match. The dial is expressed two complementary ways — learn both, the exam tests both. First, by the tools the agent is granted:
| Level | Can do | Tools granted | Controls |
|---|---|---|---|
| Low | read, search, summarize, plan | read, search | no write, no shell |
| Medium | edit files, run tests, open PR | read, search, edit, execute | PR checks + required review |
| High | use MCP (Model Context Protocol — a standard "plug" that lets an agent connect to outside tools, like a USB-C port for AI), modify workflows, coordinate agents | agent, MCP tools, shell | narrow tools, hooks (little scripts that auto-run at set moments to check or block an action), approvals, audit |
In an agent file this is literally a tools: list (lesson 1.7) — low = read, search; medium = + edit, execute; a coordinator = read, search, agent. The bigger the leash, the more enforceable controls you must add alongside it: high autonomy requires permissions, reviews, scans, rulesets, hooks, and logs.
read = read files · search = search the repo (not the web) · edit = write files · execute = run shell · agent = invoke a sub-agent (aliases agent / custom-agent / Task) · web = web fetch (not in cloud agent) · todo = task list (not in cloud agent). MCP = the mechanism for external tools/data.
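To make the capability table concrete, here is a minimal Python sketch that infers the lowest autonomy level a tool list implies and flags tools the cloud agent does not support. The tool names come from the glossary above; the function names (`implied_level`, `cloud_unsupported`) and the rule that any unrecognized tool counts as High are assumptions for illustration, not part of any GitHub API. The lesson does not place `web` or `todo` at a level, so they fall through to High here as a conservative default.

```python
from enum import IntEnum


class Autonomy(IntEnum):
    LOW = 1
    MEDIUM = 2
    HIGH = 3


# Tool groupings mirror the lesson's table.
READ_ONLY = frozenset({"read", "search"})
WRITE = frozenset({"edit", "execute"})
# Per the lesson, these two are not supported in the cloud agent today.
NOT_IN_CLOUD = frozenset({"web", "todo"})


def implied_level(tools):
    """Return the lowest autonomy level that covers every tool in the list."""
    granted = set(tools)
    # Assumption: "agent", MCP tools, and anything else unrecognized count as High.
    if granted - (READ_ONLY | WRITE):
        return Autonomy.HIGH
    if granted & WRITE:
        return Autonomy.MEDIUM
    return Autonomy.LOW


def cloud_unsupported(tools):
    """Return the subset of tools the cloud agent cannot use."""
    return set(tools) & NOT_IN_CLOUD


if __name__ == "__main__":
    coordinator = ["read", "search", "agent"]
    print(implied_level(coordinator).name)
    print(sorted(cloud_unsupported(["read", "web", "todo"])))
```

A check like this could run in review tooling so that a widened tool list is noticed as a level change rather than slipping through as a one-line config diff.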
Part 2b · Autonomy by risk (what the task could break)
The official module ties autonomy to blast radius — different tasks get different rules, not one policy everywhere:
| Risk tier | Example paths | Required control |
|---|---|---|
| Low | docs/, formatting | automerge after required checks (and reviews, if configured) |
| Medium | src/, dependency bumps | PR + checks + at least one review |
| High | infra/, .github/workflows/ | CODEOWNERS + multiple reviews + stricter rulesets |
| Critical | production deploys, secrets | environment approvals — the agent prepares but cannot execute |
An environment with required reviewers pauses the job until a human approves. That's why the exam's model answer for "risk-based autonomy for production" is literally "use GitHub Environments with required reviewers" — not "let it deploy after tests pass," not "allow direct pushes to main."
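The risk-tier table can be read as a lookup from changed paths to the strictest control that applies. The sketch below is one hedged way to express that in Python: the path prefixes and control descriptions are taken from the table, while the function names, the `touches_production` flag, and the choice to treat unlisted paths as Medium are hypothetical choices made for this example.

```python
from enum import IntEnum


class Risk(IntEnum):
    LOW = 1
    MEDIUM = 2
    HIGH = 3
    CRITICAL = 4


# Path prefixes come from the lesson's table; the matching logic is illustrative.
PREFIX_TIERS = [
    (".github/workflows/", Risk.HIGH),
    ("infra/", Risk.HIGH),
    ("src/", Risk.MEDIUM),
    ("docs/", Risk.LOW),
]

REQUIRED_CONTROL = {
    Risk.LOW: "automerge after required checks",
    Risk.MEDIUM: "PR + checks + at least one review",
    Risk.HIGH: "CODEOWNERS + multiple reviews + stricter rulesets",
    Risk.CRITICAL: "environment approvals; agent prepares, human executes",
}


def tier_for_path(path):
    """Map a single repository path to a risk tier."""
    for prefix, tier in PREFIX_TIERS:
        if path.startswith(prefix):
            return tier
    # Assumption: paths not named in the table default to Medium, not Low.
    return Risk.MEDIUM


def tier_for_change(paths, touches_production=False):
    """A change is as risky as its riskiest path; production work is Critical."""
    if touches_production:
        return Risk.CRITICAL
    return max((tier_for_path(p) for p in paths), default=Risk.LOW)


if __name__ == "__main__":
    tier = tier_for_change(["docs/guide.md", "infra/main.tf"])
    print(tier.name, "->", REQUIRED_CONTROL[tier])
```

Taking the maximum tier across all changed paths means one infra file in an otherwise docs-only change still pulls the whole change up to the High controls.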
Scenario → level (a classic question shape):
| Task | Setting |
|---|---|
| Summarize repo conventions | low (read-only) |
| Add tests for existing code | medium |
| Modify a deployment workflow | high control, low initial autonomy |
| Change production rollout behavior | human approval required |
| (study guide suggests) Use Jira/Sentry for diagnosis | medium/high, with a narrowly scoped MCP tool |
The three control boundaries
When you configure an agent, the official material says you're setting three boundaries:
- Capability boundary — which tools are allowed. Prefer allowlists. Read-only for planning/review agents; write tools only for execution agents.
- Visibility boundary — whether the agent is user-selectable in the interactive UI.
- Delegation boundary — which subagents it can invoke and how handoffs work.
And the rule that ties back to lesson 1.5: changing a tool allowlist is a governance-sensitive change (a change risky enough to need extra human review) — treat it like a security edit, not a config tweak.
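As a rough illustration of the three boundaries, the following Python sketch models them as fields on one record and adds two checks: a planner should hold only read-only tools, and any change to the allowlist should be flagged for governance review. The class, field, and function names are hypothetical and do not reflect the actual agent file format (that is lesson 1.7).

```python
from dataclasses import dataclass

READ_ONLY_TOOLS = frozenset({"read", "search"})


@dataclass(frozen=True)
class AgentBoundaries:
    name: str
    tools: frozenset            # capability boundary: explicit allowlist
    user_selectable: bool       # visibility boundary: shown in the interactive UI?
    can_delegate_to: frozenset  # delegation boundary: subagents it may invoke


def planner_violations(agent):
    """List problems if a planning/review agent holds more than read-only tools."""
    extra = agent.tools - READ_ONLY_TOOLS
    if extra:
        return ["planner holds tools beyond read/search: " + ", ".join(sorted(extra))]
    return []


def needs_governance_review(old, new):
    """Any change to the tool allowlist is treated as governance-sensitive."""
    return old.tools != new.tools


if __name__ == "__main__":
    before = AgentBoundaries("planner", frozenset({"read", "search"}), True, frozenset())
    after = AgentBoundaries("planner", frozenset({"read", "search", "edit"}), True, frozenset())
    print(planner_violations(after))
    print(needs_governance_review(before, after))
```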
An agent's real power ≈ what its workflow token and tool credentials can do. The dial isn't the prompt — it's the permissions. And: don't over-gate — keep work moving by not adding approvals that don't actually reduce risk. More human approvals ≠ safer; pointless approvals just cost speed. Put the gate where the risk is, nowhere else.
This all sits on the layered control model from lesson 1.5: instructions guide · tool lists limit · hooks intercept · workflows validate · rulesets/branch protection enforce · audit logs record. Autonomy levels are how you set the dial; those layers are how you enforce that it can't be turned past your setting.
The cert-language version
Agents run on different surfaces: a cloud agent (assigned an issue or driven from the Agents tab; explores, plans, branches, opens a PR), a CLI agent (local, with knobs like `--autopilot` and `--agent=NAME`), and IDE/app agents (Copilot, Codex, Claude). The surface doesn't set the leash. Autonomy is a per-task dial, expressed two ways: by capability (Low = `read`, `search`; Medium = + `edit`, `execute` with PR + review; High = + `agent`, MCP, shell with hooks/approvals/audit) and by risk tier (docs → automerge; src → PR + review; infra/workflows → CODEOWNERS + rulesets; production/secrets → environment approvals where the agent prepares but a human executes). Configure it through three boundaries (capability, visibility, delegation), preferring read-only allowlists for planners; remember an agent's power equals its token + credential scope; and don't add approvals that don't reduce real risk.
Our summary · grounded in MS Learn: Foundations of Agentic AI (units 1, 4) + Designing Agent Architecture & SDLC Integration (units 1, 4, 6–9) · fetched 2026-05-31
Common confusions (read these or lose points)
- "A cloud agent is more autonomous than a CLI agent." No. Autonomy is a dial you set, not a property of the surface. A cloud agent limited to `read, search` is less autonomous than a CLI agent with `edit, execute`.
- "For production, let the agent deploy once tests pass." No. Critical tasks need environment approvals: agent prepares, human executes. "Deploy after tests" and "allow direct pushes to main" are the canonical wrong answers.
- "`search` lets the agent browse the web." No. `search` is repository search. `web` is web fetch, and `web` (and `todo`) are not available in the cloud agent today.
- "More approvals = safer." No. Minimize approvals that don't reduce real risk; needless gates only cost velocity.
- "High autonomy just means trusting the model more." No — it means more enforceable controls: narrow tools, hooks, approvals, audit. The model isn't the safeguard; the architecture is.
- "Tweaking the tool list is just config." No — a tool-allowlist change is a governance-sensitive change; review it like a security edit.
- (Study guide suggests) the exact CLI taxonomy `/delegate` · `& prompt` · `/fleet` · `--no-ask-user` and the Jira/Sentry → narrow-MCP scenario are gist framing, not confirmed in the official pool. Learn the concepts, hold the exact strings loosely.