The Third Era of Cursor: Transitioning from Code Supervision to Problem Definition

Explore how Cloud Agents redefine human-agent collaboration, shifting the focus from supervision to problem definition and acceptance criteria.


Core Assertion: Cloud Agents represent not just a technological upgrade but a change in the fundamental unit of human-machine collaboration. Humans no longer supervise every Agent session; instead they manage problem definitions and acceptance criteria. This mirrors Anthropic's dual-component Harness design of "Initializer Agent + Coding Agent", and both point to the same conclusion: the core engineering challenge for future Agent systems is not making Agents faster but redesigning the interaction boundary between humans and Agents.

Reader Profile: Engineers who have used Cursor Agent or Claude Code and understand basic Agent concepts but are still in the mindset of “watching every session closely”.

Core Barrier: Even engineers who know what Agents can do often remain stuck in the serial mode of "one Agent per session"—the value of Cloud Agents is not immediately apparent, and they lack a systematic framework for designing around it.


1. The Essence of the Third Era: Change in Collaboration Units

Cursor’s official blog clearly defines three eras:

| Era | Core Collaboration Unit | Human Role | Time Span |
|---|---|---|---|
| First Era (Tab) | Single line of code | Review + Accept | ~2 years |
| Second Era (Synchronous Agent) | Single Agent session | Supervise + Guide | <1 year |
| Third Era (Cloud Agents) | Multiple Agents in parallel + work definition | Define problems + Set acceptance criteria | Coming soon |

“As a result, Cursor is no longer primarily about writing code. It is about helping developers build the factory that creates their software.” — Cursor Blog: The third era of AI software development

The key point here is not that “Cursor has changed” but that the collaboration unit has fundamentally shifted. In the first era, the unit was a single line of code; in the second era, it was a single Agent session; in the third era, the unit is a group of concurrently running Agents + the human-defined problem boundaries.


2. The Fundamental Bottleneck of Synchronous Agents: Why Cloud Agents are Not “Better” but “Different”

Many perceive Cloud Agents as simply “running Agents in the cloud”, but this is merely a technical description. The truly significant change lies in the fundamental shift in interaction patterns.

2.1 Two Implicit Constraints of Synchronous Agents

Synchronous Agents (running locally with real-time feedback) have two implicit constraints:

Constraint One: Resource Competition
Local machines can only run a limited number of Agent sessions simultaneously. When you open 3 Agent Tabs in Cursor, each consumes local CPU and context resources. You cannot run 10 local Agents at once—not because of the Agents' capabilities, but because of local resource contention and the lack of isolation between sessions.

Constraint Two: Context Reconstruction Cost
The output of synchronous Agents consists of real-time diffs and chat messages. When you need to evaluate the output, you must re-enter the context of that session—looking at diffs, reading logs, reconstructing the thought process at that time. This cost restricts humans to focusing on only a few sessions.

“Cloud agents remove both constraints. Each runs on its own virtual machine, allowing a developer to hand off a task and move on to something else.” — Cursor Blog: The third era of AI software development

2.2 Core Value of Cloud Agents: Artifacts as Evaluation Medium

The key innovation of Cloud Agents is not merely running in the cloud but changing the evaluation medium.

The output of synchronous Agents includes: diff files, chat messages, terminal outputs. The output of Cloud Agents is: Artifacts (screenshots, screen recordings, live previews).

Artifacts are, in essence, evaluation results that require no context reconstruction. Without entering the Agent session, you can know:

  • Whether the functionality works as expected (screenshots/screen recordings)
  • Whether the interface meets design (live previews)
  • Overall progress (the quantity and quality of Artifacts)

This enables parallel evaluation—humans can review multiple Agents' outputs simultaneously without switching contexts between sessions.
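The artifact-based review loop above can be sketched as a small data structure. This is an illustrative sketch, not Cursor's API; `Artifact`, `ArtifactKind`, and `triage` are hypothetical names:

```python
from dataclasses import dataclass
from enum import Enum

class ArtifactKind(Enum):
    SCREENSHOT = "screenshot"
    RECORDING = "recording"
    LIVE_PREVIEW = "live_preview"

@dataclass
class Artifact:
    agent_id: str
    kind: ArtifactKind
    uri: str      # where the screenshot/recording/preview is stored
    claim: str    # the one-line statement this artifact is meant to prove

def triage(artifacts: list[Artifact]) -> dict[str, list[Artifact]]:
    """Group artifacts by agent so a reviewer can scan every session's
    evidence side by side, without re-entering any session."""
    queue: dict[str, list[Artifact]] = {}
    for a in artifacts:
        queue.setdefault(a.agent_id, []).append(a)
    return queue
```

The point of the sketch is the shape of the review: each artifact carries a claim, so the human evaluates claims against evidence rather than replaying sessions.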


3. Fundamental Shift in Human Roles: From “Supervision” to “Definition”

This is the most critical cognitive shift of the third era. The Cursor blog mentions three key behavioral changes:

“We see the developers adopting this new way of working as characterized by three traits: Agents write almost 100% of their code. They spend their time breaking down problems, reviewing artifacts, and giving feedback. They spin up multiple agents simultaneously instead of handholding one to completion.” — Cursor Blog: The third era of AI software development

3.1 Separation of Responsibilities Among Three Roles

| Traditional Model | Third Era Model | Key Difference |
|---|---|---|
| Human writes code (or supervises the Agent writing it) | Agent writes almost 100% of the code | Humans exit the implementation layer entirely |
| Human guides session by session | Human breaks down problems + gives feedback | Focus moves to problem decomposition, not implementation |
| Human works serially, one session at a time | Human evaluates multiple Agents in parallel | Parallel evaluation replaces serial supervision |

3.2 “Problem Decomposition” as a Core Skill

In the third era, the core work of humans becomes decomposing complex problems into parallelizable subtasks and setting acceptance criteria for each subtask (in the form of Artifacts).
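One way to make "acceptance thinking" concrete is to refuse to dispatch any subtask that lacks artifact-shaped acceptance criteria. A minimal sketch, where the `Subtask` type and `unacceptable` check are hypothetical and not part of any Cursor API:

```python
from dataclasses import dataclass, field

@dataclass
class Subtask:
    title: str
    acceptance: list[str]   # artifact-shaped criteria, e.g. "recording of the search flow"
    out_of_scope: list[str] = field(default_factory=list)  # explicit boundaries for the Agent

def unacceptable(subtasks: list[Subtask]) -> list[str]:
    """Titles of subtasks carrying no acceptance criteria: such a subtask
    cannot be evaluated through Artifacts and should not be dispatched."""
    return [t.title for t in subtasks if not t.acceptance]
```

The check encodes the shift described above: a subtask is well-defined only when the human has already decided what evidence will count as "done".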

This aligns closely with Anthropic’s Initializer Agent design—the role of the Initializer Agent is also to “establish a complete functional map so that subsequent Agents know what else needs to be done”. However, in Anthropic’s design, this is a protocol between Agents, whereas in Cursor’s third era, it is a protocol between humans and Agents.

“The human role shifts from guiding each line of code to defining the problem and setting review criteria.” — Cursor Blog: The third era of AI software development


4. Engineering Implementation of Cursor 3: Multi-Repo + Seamless Handoff

Cursor 3 is the product realization of this paradigm shift. Three key engineering decisions:

4.1 Multi-Repo Layout

┌─────────────────────────────────────────────────────────────┐
│  Cursor 3 Sidebar                                            │
├─────────────────────────────────────────────────────────────┤
│  [Agent 1] ← repo-A (cloud, working)                        │
│  [Agent 2] ← repo-B (local, paused)                        │
│  [Agent 3] ← repo-A (cloud, completed → awaiting review)    │
│  [Agent 4] ← repo-C (local, active)                         │
└─────────────────────────────────────────────────────────────┘

All local and cloud Agents are managed uniformly in the sidebar, including Agents triggered from different channels (mobile, web, desktop, Slack, GitHub, Linear).

4.2 Seamless Handoff Between Cloud and Local

Cursor 3 allows for quick migration of Agent sessions between local and cloud:

  • Cloud → Local: Migrate a cloud Agent to local for continued editing and testing
  • Local → Cloud: Push a local Agent to the cloud to keep running (suitable before the end of the workday)

This is the product realization of Anthropic’s “multi-session state management”—no manual context export is needed; the platform automatically handles state migration.
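A handoff like this amounts to serializing session state on one side and rehydrating it on the other. Cursor's actual migration protocol is not public; the round-trip below is a minimal sketch with hypothetical field names:

```python
import json
from dataclasses import dataclass, asdict

@dataclass
class SessionState:
    agent_id: str
    repo: str
    branch: str
    pending_steps: list[str]  # what the agent still plans to do

def export_state(state: SessionState) -> str:
    """Serialize everything the receiving side (cloud VM or local
    machine) needs in order to resume the session."""
    return json.dumps(asdict(state))

def import_state(payload: str) -> SessionState:
    """Rehydrate a session from the serialized payload."""
    return SessionState(**json.loads(payload))
```

The design point is that the payload, not the human, carries the context: whichever environment receives it can resume without the user re-explaining anything.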

4.3 Composer 2 as the Frontier Model

“Composer 2, our own frontier coding model with high usage limits, is great for iterating quickly.” — Cursor Blog: Cursor 3

Cursor uses its self-developed Composer 2 as the frontier model behind Cloud Agents instead of calling third-party APIs—a strategic upgrade from "tool" to "platform": Cursor is no longer just a UI wrapper around Claude/ChatGPT but an AI Coding platform with control over its own model layer.


5. Paradigm Comparison: Applicable Boundaries of Three Agent Collaboration Modes

| Mode | Applicable Scenarios | Inapplicable Scenarios | Core Bottleneck |
|---|---|---|---|
| Tab (First Era) | Small modifications, templated code | Complex functions, cross-file modifications | Handles only low-entropy tasks |
| Synchronous Agent (Second Era) | Small-to-medium tasks finished in one session | Multi-session or long-running tasks | Resource competition + context reconstruction |
| Cloud Agents (Third Era) | Large projects, many parallel Agents, long tasks | Small quick fixes (startup cost not worth it) | Demands strong problem-decomposition ability |

The author believes: the third era will not completely replace the second era. For tasks like “writing an API endpoint” or “fixing a bug”, the feedback speed of synchronous Agents is still faster. The value of Cloud Agents lies in scalability—the advantages of the third era truly manifest when you need to handle more than 5 tasks simultaneously.


6. Engineering Practice: Migration Path from “Supervision Mode” to “Definition Mode”

The Cursor blog mentions a key figure: 35% of internal PRs at Cursor are now created by Cloud Agents. This number indicates that even within Cursor, the migration is not complete.

Key steps in the migration:

Step 1: Problem Decomposition Training
Break down large requirements into “Agent-executable + Artifact-evaluable” problem units. This requires humans to shift from “code thinking” to “acceptance thinking”.

Step 2: Pre-emptive Acceptance Criteria
Before starting an Agent, clarify:

  • What are the success criteria? (functionality screenshots, test passes, performance metrics)
  • How will evaluation be conducted? (in the form of Artifacts, PRs)
  • What are the boundaries? (what does not need to be done by the Agent)

Step 3: Parallel Experimentation with Multiple Agents
Start by running 2 Agents simultaneously to build a feel for the rhythm of Artifact evaluation, then gradually expand to 5 or more.

Step 4: Workflow Integration
Automatically map Linear/Jira issues to Agent tasks, achieving a complete pipeline of “problem in, PR out”.
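The issue-to-task mapping in Step 4 can be sketched as a pure function. The field names below are hypothetical, and real Linear/Jira payloads differ; the sketch only shows the shape of "problem in, PR out":

```python
def issue_to_task(issue: dict) -> dict:
    """Turn a tracker issue into an agent task: the issue text becomes the
    problem definition, and the acceptance criterion is phrased as a PR."""
    return {
        "problem": f"{issue['title']}\n\n{issue.get('body', '')}".strip(),
        "scope_hints": issue.get("labels", []),
        "acceptance": [f"PR referencing issue #{issue['id']} with passing CI"],
    }
```

Note that even in an automated pipeline, the acceptance criterion is defined before the Agent starts—the human's "definition mode" work is baked into the mapping itself.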


7. Theoretical Resonance with Anthropic’s Dual-Component Harness

The paradigm of Cursor’s third era resonates interestingly with Anthropic’s dual-component Harness design:

| Dimension | Cursor Third Era | Anthropic Dual-Component Harness |
|---|---|---|
| Human role | Define problems + set acceptance criteria | Initializer Agent establishes the functional map |
| Agent collaboration | Multiple Agents in parallel (via problem decomposition) | Initializer + Coding Agent division of labor |
| State inheritance | Seamless handoff between cloud and local | feature_list.json + progress.txt |
| Evaluation medium | Artifacts (screenshots/screen recordings) | End-to-end tests + Puppeteer MCP |
| Failure handling | One Agent's failure does not affect the others | One feature's failure does not affect the others |

Both are addressing the same core issue: how to liberate humans from the model of “supervising every session”, but they take different paths—Cursor relies on product design (Cloud Agents + Artifacts), while Anthropic relies on Harness engineering (session protocols + Feature List).
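Under Anthropic's design as described here, a fresh session derives its to-do list from the two files named in the comparison. A minimal sketch, assuming feature_list.json holds a JSON array of feature names and progress.txt lists one completed feature per line—both file formats are assumptions, not documented specifications:

```python
import json
from pathlib import Path

def remaining_features(workdir: Path) -> list[str]:
    """Features listed in feature_list.json but not yet marked done in
    progress.txt: the to-do list a fresh session starts from."""
    features = json.loads((workdir / "feature_list.json").read_text())
    done = set((workdir / "progress.txt").read_text().splitlines())
    return [f for f in features if f not in done]
```

This is the protocol-layer counterpart of Cloud Agents' platform-layer handoff: state survives across sessions because it lives in files, not in any one session's context window.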

The author believes: this indicates a consensus is forming in the industry—“supervision mode” is not a scalable way to use Agents. Whether through platform layers (Cloud Agents) or protocol layers (Feature List), the mainstream direction of the future is humans defining problem boundaries and Agents executing autonomously.


8. Conclusion and Insights

The core contribution of Cursor’s third era is not the product feature of “Cloud Agents” but the validation of a paradigm hypothesis: humans can exit the model of “supervising code” and let Agents manage themselves.

Insights for Agent developers:

  1. Problem decomposition ability will become the most scarce product/engineering capability in the Agent era.
  2. Pre-emptive acceptance criteria are more important than “writing good prompts”—prompts determine how Agents work, while acceptance criteria determine whether Agents did it correctly.
  3. The key to parallel multiple Agents is not technology but the granularity of problem decomposition—too coarse decomposition can leave some Agents idle, while too fine decomposition increases human coordination costs.

Insights for Agent framework developers:

  1. Artifact evaluation mechanisms are the core innovation of Cloud Agents; frameworks need to natively support structured output proofs.
  2. State migration protocols (Cloud ↔ Local) need to be resolved at the framework level, rather than letting each product implement it independently.
  3. Problem-task mapping (Issue → Agent Task → PR) is the only feasible entry point for automated workflows.
