PHASE A · LESSON 1.6

Agent variants & autonomy levels

The same agent loop runs on different surfaces — cloud, CLI, IDE. But the real exam topic is the autonomy dial: how much leash you give a task, set by capability and by risk — and enforced with real GitHub controls.

~10 min read · 4 quiz questions · Tier 1 source cited

Story

Same build site. Today three robot models show up to the crew, and they don't all work the same way:

An off-site workshop unit — you mail it the job ticket, it builds the part back at the factory, then ships you a finished crate to inspect before you bolt it on. You never watch it work; you review the result. (The cloud agent — assign it an issue, it works on its own branch, hands you a PR.)
A handheld guided tool in your own hands on-site — fast, does exactly what you point it at, right here on your machine. (The CLI agent.)
A desktop unit at the workbench in the IDE — you drive it in your editor, watch every move. (The app/IDE agent; brands include Copilot, Codex, Claude.)

Same crew, different surfaces. But here's the part that matters for the exam: how much leash you give any of them is not a property of the robot — it's a dial you set per job, based on how much damage a mistake could do. Sweeping the floor? Let it run. Rewiring the main electrical panel? It prepares the work and waits for a human to throw the switch.

That dial is autonomy, and the craft of agent architecture is turning it down where the blast radius (how much could break if it goes wrong) is big — and proving, with real GitHub controls, that the dial can't be turned past where you set it.

Part 1 · The agent variants (surfaces)

An agent isn't one thing. The same "explore → plan → act → evaluate" loop from lesson 1.4 runs on different surfaces, and the surface changes how you interact with it.

Variant	Where it runs	How you start it	What you get back
Cloud agent (Copilot Cloud Agent)	GitHub's servers	Assign a GitHub issue, or use the Agents tab to generate a plan	A branch + a pull request to review
CLI agent	Your local machine / a runner	Run the CLI; pass flags	Edits and actions on your machine, live
IDE / app agent	Your editor	Drive it in-editor	Suggestions and edits you watch in real time

Key facts about the cloud agent (exam-relevant):

It explores the repo, suggests a plan, edits a branch, and opens a PR — the human reviews the PR. This is the contributor model from lesson 1.3, made concrete.
Two ways to kick it off: assign an issue to Copilot, or use the Agents tab.
Not every tool is available on every surface. The web tool (fetch URLs / web search) and the todo tool (task list) are not supported in the cloud agent today — so "the cloud agent can browse the live web" is a wrong-answer trap.

CLI knobs worth knowing (official-confirmed):

--autopilot — autopilot mode lets the agent continue working autonomously on the local machine until the task is complete, without stopping to ask.
--agent=NAME — select a specific custom agent by name (e.g. copilot --agent=refactor-agent).
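
To make the two confirmed knobs concrete, here is a minimal Python sketch that only assembles the argument list and never runs the CLI. The helper name build_copilot_argv is hypothetical, and passing both flags in one invocation is an assumption rather than something the official material states.

```python
from typing import List, Optional


def build_copilot_argv(agent: Optional[str] = None, autopilot: bool = False) -> List[str]:
    """Assemble (but do not run) an argument list for the CLI agent.

    Hypothetical helper: it uses only the two flags named in this lesson.
    """
    argv = ["copilot"]
    if agent:
        # --agent=NAME selects a specific custom agent by name.
        argv.append(f"--agent={agent}")
    if autopilot:
        # --autopilot lets the agent keep working until the task is complete.
        argv.append("--autopilot")
    return argv


# The lesson's own example: copilot --agent=refactor-agent
print(" ".join(build_copilot_argv(agent="refactor-agent")))
# Assumption: the two flags can be combined in one call.
print(" ".join(build_copilot_argv(agent="refactor-agent", autopilot=True)))
```

Keeping autopilot off by default mirrors the lesson's stance: more leash is something you opt into per task.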

Study guide suggests — hold the exact strings loosely

The gist also lists /delegate (or & prompt) to push a task to the cloud agent as a background job, /fleet to split work into parallel subagents, and --no-ask-user to suppress prompts. The mental model (continue locally vs. hand off to the cloud vs. fan out) is useful, but none of those command or flag names beyond --autopilot and --agent are in the official pool, so don't bet the exam on them.

Custom agents live in files. An org- or enterprise-scoped custom agent is defined under /agents/ inside a .github-private repository. (The file format — role, model, tools, persona — is lesson 1.7.)

Part 2a · Autonomy by capability (what tools it holds)

This is the heart of the lesson and a named exam objective: set how much autonomy an agent gets for a task, and build the guardrails to match. The dial is expressed in two complementary ways, and the exam tests both. First, by the tools the agent is granted:

Level	Can do	Tools granted	Controls
Low	read, search, summarize, plan	`read`, `search`	no write, no shell
Medium	edit files, run tests, open PR	`read`, `search`, `edit`, `execute`	PR checks + required review
High	use MCP (Model Context Protocol — a standard "plug" that lets an agent connect to outside tools, like a USB-C port for AI), modify workflows, coordinate agents	`agent`, MCP tools, shell	narrow tools, hooks (little scripts that auto-run at set moments to check or block an action), approvals, audit

In an agent file this is literally a tools: list (lesson 1.7) — low = read, search; medium = + edit, execute; a coordinator = read, search, agent. The bigger the leash, the more enforceable controls you must add alongside it: high autonomy requires permissions, reviews, scans, rulesets, hooks, and logs.
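
The capability table can be read as a small classification rule. The Python sketch below is one way to express it, not an official algorithm: the function name autonomy_level and the "mcp:" prefix used to spot MCP tools are assumptions made for illustration.

```python
from typing import Iterable

WRITE_TOOLS = {"edit", "execute"}   # Medium adds these to read/search
COORDINATION_TOOLS = {"agent"}      # High adds sub-agent invocation

# Controls the lesson pairs with each level.
REQUIRED_CONTROLS = {
    "low": ["no write", "no shell"],
    "medium": ["PR checks", "required review"],
    "high": ["narrow tools", "hooks", "approvals", "audit"],
}


def autonomy_level(tools: Iterable[str], mcp_prefix: str = "mcp:") -> str:
    """Classify a tools list as low / medium / high by capability.

    Assumption: MCP tools are named with a hypothetical "mcp:" prefix.
    Tools outside the table (e.g. web, todo) do not change the level here.
    """
    granted = set(tools)
    if granted & COORDINATION_TOOLS or any(t.startswith(mcp_prefix) for t in granted):
        return "high"
    if granted & WRITE_TOOLS:
        return "medium"
    return "low"


for tools in (["read", "search"],
              ["read", "search", "edit", "execute"],
              ["read", "search", "agent"]):
    level = autonomy_level(tools)
    print(tools, "->", level, "| controls:", REQUIRED_CONTROLS[level])
```

Note that the coordinator (read, search, agent) lands in the high tier even though it cannot write: delegation is itself a capability that needs controls.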

Tool meanings (memorize — they're MCQ distractors)

read = read files · search = search the repo (not the web) · edit = write files · execute = run shell · agent = invoke a sub-agent (aliases agent / custom-agent / Task) · web = web fetch (not in cloud agent) · todo = task list (not in cloud agent). MCP = the mechanism for external tools/data.
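
A quick way to drill the distractors is to encode the list and check a tools list against the cloud agent's gaps. This Python sketch uses only the meanings stated above; the names TOOL_MEANINGS and unsupported_in_cloud are hypothetical, and the unsupported set reflects the lesson's "today" caveat.

```python
from typing import Iterable, List

# Meanings as given in this lesson.
TOOL_MEANINGS = {
    "read": "read files",
    "search": "search the repo (not the web)",
    "edit": "write files",
    "execute": "run shell",
    "agent": "invoke a sub-agent",
    "web": "web fetch",
    "todo": "task list",
}

# Per the lesson, these two are not supported in the cloud agent today.
CLOUD_UNSUPPORTED = {"web", "todo"}


def unsupported_in_cloud(tools: Iterable[str]) -> List[str]:
    """Return the requested tools the cloud agent cannot use, sorted by name."""
    return sorted(set(tools) & CLOUD_UNSUPPORTED)


requested = ["read", "search", "web", "todo"]
for name in unsupported_in_cloud(requested):
    print(f"{name} ({TOOL_MEANINGS[name]}) is not available in the cloud agent")
```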

Part 2b · Autonomy by risk (what the task could break)

The official module ties autonomy to blast radius — different tasks get different rules, not one policy everywhere:

Risk tier	Example paths	Required control
Low	`docs/`, formatting	automerge after required checks (and reviews, if configured)
Medium	`src/`, dependency bumps	PR + checks + at least one review
High	`infra/`, `.github/workflows/`	CODEOWNERS + multiple reviews + stricter rulesets
Critical	production deploys, secrets	environment approvals — the agent prepares but cannot execute
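
As a rough illustration of "different tasks get different rules", the Python sketch below maps changed paths to the three path-based tiers in the table and picks the strictest one. It is a study aid under stated assumptions: the function names are hypothetical, unmatched paths default to medium by choice, and the Critical tier is left out because it is enforced by environment approvals rather than path matching.

```python
from typing import Iterable

TIER_ORDER = ["low", "medium", "high"]  # ascending strictness

# Path prefixes taken from the table's example paths.
PREFIX_TIERS = {
    "docs/": "low",
    "src/": "medium",
    "infra/": "high",
    ".github/workflows/": "high",
}

REQUIRED_CONTROL = {
    "low": "automerge after required checks (and reviews, if configured)",
    "medium": "PR + checks + at least one review",
    "high": "CODEOWNERS + multiple reviews + stricter rulesets",
}


def tier_for_path(path: str, default: str = "medium") -> str:
    """Tier for one path; the default for unmatched paths is an assumption."""
    for prefix, tier in PREFIX_TIERS.items():
        if path.startswith(prefix):
            return tier
    return default


def tier_for_change(paths: Iterable[str], default: str = "medium") -> str:
    """A change is as risky as its riskiest file."""
    tiers = [tier_for_path(p, default) for p in paths]
    return max(tiers, key=TIER_ORDER.index, default=default)


change = ["docs/guide.md", ".github/workflows/deploy.yml"]
tier = tier_for_change(change)
print(tier, "->", REQUIRED_CONTROL[tier])
```

Taking the maximum means a docs-only change can automerge, while the same docs edit bundled with a workflow change inherits the high-tier controls.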

The enforcement point for the top of the dial: GitHub Environments

An environment with required reviewers pauses the job until a human approves. That's why the exam's model answer for "risk-based autonomy for production" is literally "use GitHub Environments with required reviewers" — not "let it deploy after tests pass," not "allow direct pushes to main."
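
The "agent prepares, human executes" behavior can be pictured with a toy in-memory model. This Python sketch is not the GitHub API and does not configure anything; the class EnvironmentGate, its method names, and the user names are hypothetical, and it assumes a single approval from any listed reviewer releases the job.

```python
from dataclasses import dataclass, field
from typing import FrozenSet, Set


@dataclass
class EnvironmentGate:
    """Toy model of an environment with required reviewers."""
    required_reviewers: FrozenSet[str]
    approvals: Set[str] = field(default_factory=set)

    def approve(self, user: str) -> None:
        # Only a listed human reviewer may approve; the agent is not one.
        if user not in self.required_reviewers:
            raise PermissionError(f"{user} is not a required reviewer")
        self.approvals.add(user)

    def job_status(self) -> str:
        # The job stays paused until someone on the list approves.
        return "running" if self.approvals else "waiting"


gate = EnvironmentGate(required_reviewers=frozenset({"alice", "bob"}))
print(gate.job_status())   # the agent has prepared the deploy; nothing runs yet
gate.approve("alice")
print(gate.job_status())
```

The point of the model is where the decision lives: the agent's token can reach the gate but cannot open it.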

Scenario → level (a classic question shape):

Task	Setting
Summarize repo conventions	low (read-only)
Add tests for existing code	medium
Modify a deployment workflow	high control, low initial autonomy
Change production rollout behavior	human approval required
(study guide suggests) Use Jira/Sentry for diagnosis	medium/high, with a narrowly scoped MCP tool

The three control boundaries

When you configure an agent, the official material says you're setting three boundaries:

Capability boundary — which tools are allowed. Prefer allowlists. Read-only for planning/review agents; write tools only for execution agents.
Visibility boundary — whether the agent is user-selectable in the interactive UI.
Delegation boundary — which subagents it can invoke and how handoffs work.
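
One way to hold the three boundaries in your head is as three fields on a profile. The Python sketch below is illustrative only: the class AgentProfile and its field names are hypothetical and are not the real agent file format, which is covered in lesson 1.7.

```python
from dataclasses import dataclass
from typing import FrozenSet

WRITE_TOOLS = frozenset({"edit", "execute"})


@dataclass(frozen=True)
class AgentProfile:
    name: str
    tools: FrozenSet[str]                     # capability boundary (allowlist)
    user_selectable: bool                     # visibility boundary
    subagents: FrozenSet[str] = frozenset()   # delegation boundary

    def is_read_only(self) -> bool:
        """True when no write-capable tool is on the allowlist."""
        return not (self.tools & WRITE_TOOLS)

    def may_delegate_to(self, other: str) -> bool:
        """Delegation needs both the agent tool and an allowed target."""
        return "agent" in self.tools and other in self.subagents


planner = AgentProfile("planner", frozenset({"read", "search"}), user_selectable=True)
coordinator = AgentProfile(
    "coordinator",
    frozenset({"read", "search", "agent"}),
    user_selectable=True,
    subagents=frozenset({"planner"}),
)
print(planner.is_read_only())
print(coordinator.may_delegate_to("planner"))
print(coordinator.may_delegate_to("deployer"))
```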

And the rule that ties back to lesson 1.5: changing a tool allowlist is a governance-sensitive change (a change risky enough to need extra human review) — treat it like a security edit, not a config tweak.
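
A reviewer-side check for that rule might look like the Python sketch below. It simply diffs two allowlists and flags any difference for extra review; the function names are hypothetical, and treating removals as review-worthy too is an assumption of this sketch.

```python
from typing import Dict, Iterable, List


def allowlist_change(before: Iterable[str], after: Iterable[str]) -> Dict[str, List[str]]:
    """Report which tools were added to or removed from an allowlist."""
    old, new = set(before), set(after)
    return {"added": sorted(new - old), "removed": sorted(old - new)}


def needs_governance_review(before: Iterable[str], after: Iterable[str]) -> bool:
    """Any allowlist difference is treated as governance-sensitive."""
    change = allowlist_change(before, after)
    return bool(change["added"] or change["removed"])


old_tools = ["read", "search"]
new_tools = ["read", "search", "edit", "execute"]
print(allowlist_change(old_tools, new_tools))
print(needs_governance_review(old_tools, new_tools))
```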

Two principles that decide the close calls

An agent's real power ≈ what its workflow token and tool credentials can do. The dial isn't the prompt — it's the permissions. And: don't over-gate — keep work moving by not adding approvals that don't actually reduce risk. More human approvals ≠ safer; pointless approvals just cost speed. Put the gate where the risk is, nowhere else.

This all sits on the layered control model from lesson 1.5: instructions guide · tool lists limit · hooks intercept · workflows validate · rulesets/branch protection enforce · audit logs record. Autonomy levels are how you set the dial; those layers are how you enforce that it can't be turned past your setting.

The cert-language version

Agents run on different surfaces — a cloud agent (assigned an issue or driven from the Agents tab; explores, plans, branches, opens a PR), a CLI agent (local, with knobs like --autopilot and --agent=NAME), and IDE/app agents (Copilot, Codex, Claude). The surface doesn't set the leash. Autonomy is a per-task dial, expressed two ways: by capability (Low = read,search; Medium = +edit,execute with PR+review; High = +agent,MCP,shell with hooks/approvals/audit) and by risk tier (docs→automerge; src→PR+review; infra/workflows→CODEOWNERS+rulesets; production/secrets→environment approvals where the agent prepares but a human executes). Configure it through three boundaries — capability, visibility, delegation — preferring read-only allowlists for planners; remember an agent's power equals its token + credential scope; and don't add approvals that don't reduce real risk.

Our summary · grounded in MS Learn — Foundations of Agentic AI (units 1, 4) + Designing Agent Architecture & SDLC Integration (units 1, 4, 6–9) · fetched 2026-05-31

Common confusions (read these or lose points)

"A cloud agent is more autonomous than a CLI agent." No — autonomy is a dial you set, not a property of the surface. A cloud agent limited to read, search is less autonomous than a CLI agent with edit, execute.
"For production, let the agent deploy once tests pass." No. Critical tasks need environment approvals — agent prepares, human executes. "Deploy after tests" and "allow direct pushes to main" are the canonical wrong answers.
"search lets the agent browse the web." No — search is repository search. web is web fetch, and web (and todo) are not available in the cloud agent today.
"More approvals = safer." No — minimize approvals that don't reduce real risk; needless gates only cost velocity.
"High autonomy just means trusting the model more." No — it means more enforceable controls: narrow tools, hooks, approvals, audit. The model isn't the safeguard; the architecture is.
"Tweaking the tool list is just config." No — a tool-allowlist change is a governance-sensitive change; review it like a security edit.
(Study guide suggests) the exact CLI taxonomy /delegate · & prompt · /fleet · --no-ask-user and the Jira/Sentry→narrow-MCP scenario are gist framing, not confirmed in the official pool — learn the concepts, hold the exact strings loosely.

Quiz · Lock it in

Q1 · multiple choice

You want an agent to automatically deploy to production once CI is green. What's the correct risk-based control?

Answer · C. Production is a Critical-risk task. The enforcement point is a GitHub Environment with required reviewers, which pauses the job until a human approves — the agent prepares the change but cannot execute it. A and B hand the agent unsafe power; D throws away the automation entirely.

Q2 · multiple choice

Which tool set matches a Medium autonomy level?

Answer · B. Medium = edit files, run tests, open a PR (read, search, edit, execute) with PR checks + required review. A is Low; C is High; D names two tools that aren't even available in the cloud agent.

Q3 · multiple choice

A teammate says "the cloud agent is inherently more autonomous than the CLI agent." Why is that wrong?

Answer · D. Surface (cloud / CLI / IDE) is just where the agent runs. A cloud agent confined to read, search is less autonomous than a CLI agent granted edit, execute. You set the leash with the capability + risk dial, not by picking a surface.

Q4 · explain back

In your own words: name the three agent surfaces, then give the Low/Medium/High autonomy levels by capability, and name the three boundaries an agent profile configures.