Claude Managed Agents: Dreaming, Outcomes, and Multiagent
Anthropic released three new Claude Managed Agents features on May 7, 2026: dreaming (a research preview that lets agents learn from their own session history), outcomes (a rubric-based grading system that guides agent behavior toward defined success criteria), and multiagent orchestration (a lead agent plus up to 20 parallel specialists sharing a filesystem). Outcomes and multiagent orchestration are in public beta; dreaming is an opt-in research preview.
This is a meaningful capability expansion for developers who have been building with Managed Agents. The three features address distinct failure modes: sessions that repeat the same mistakes, agents that optimize for the wrong thing, and single-threaded workflows that don't parallelize well.
Dreaming: Agents That Learn Between Sessions
The standard Managed Agents memory model stores facts and preferences within and across sessions — but it is passive. If the agent makes the same category of mistake 10 times across 10 sessions, nothing in the existing memory system surfaces that pattern. You would have to review the sessions yourself to notice it.
Dreaming runs on a schedule (configurable per agent) and does that review automatically. It reads past sessions, extracts patterns — recurring errors, workflows the agent reliably converges on, shared preferences across team members using the same agent — and restructures the memory store to surface what is high-signal.
From the official Anthropic blog: Dreaming surfaces patterns that a single agent session can't see on its own, including recurring mistakes, workflows that agents converge on, and preferences shared across a team. It also restructures memory so it stays high-signal as it evolves.
You control how much autonomy it has. Dreaming can update memory automatically, or it can queue proposed changes for your review before they land. The research preview framing means the behavior may change, but the controlled-review option makes this usable in production-adjacent workflows today.
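To make that concrete, here is a sketch of what a per-agent dreaming configuration could look like, expressed as a Python dict for illustration. The key names ("dreaming", "schedule", "review_mode") and their values are assumptions on my part, not confirmed fields; the actual schedule and review settings live in the YAML agent config covered in the API docs.

# Illustrative sketch only: "dreaming", "schedule", and "review_mode" are
# assumed key names, not confirmed fields from the Dreams API docs.
dreaming_config = {
    "dreaming": {
        "schedule": "daily",     # how often the review pass runs for this agent
        "review_mode": "queue",  # "auto" would apply memory updates directly;
                                 # "queue" holds proposed changes for human review
    }
}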
Anthropic reports that Harvey, the legal AI company, saw task completion rates increase roughly 6x after enabling dreaming. They attribute this to the agent learning their workflow-specific conventions rather than rediscovering them each session.
Outcomes: Telling the Agent What Success Means
The Outcomes feature addresses a different problem: standard prompting tells an agent what to do, but it does not define what good output looks like. An agent optimizing for task completion can technically finish a task while producing output that misses the actual goal.
With Outcomes, you write a rubric. You specify what a successful result looks like in plain language — a grader reads this rubric and evaluates the agent's output in a separate context window, independent of the agent's reasoning trace. The separation matters: a grader that can see how the agent reached its conclusion tends to follow the same reasoning path when evaluating it, which introduces bias. Running the grader independently prevents that.
According to Anthropic's internal benchmarks, outcomes improved task success rates by up to 10 percentage points over a standard prompting loop, with the largest gains on the hardest tasks.
A minimal outcomes configuration looks like this in the Managed Agents API:
# managed_agent_config.yaml (illustrative, based on API docs structure)
agent:
  model: claude-opus-4-7
  prompt: "You are a customer support specialist. Handle the following ticket."
  outcomes:
    rubric: |
      A successful response:
      - Addresses the user's specific issue, not a generic version of it
      - Includes a concrete next step the user can take
      - Does not promise capabilities the product does not have
      - Is under 200 words
    grader_model: claude-sonnet-4-6
The grader runs after the agent completes; when combined with dreaming, its score feeds back into the agent's improvement loop.
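In practice you will want to read that score programmatically. The sketch below assumes the completed session exposes the grader's result; the retrieve call and the "outcome", "score", and "rationale" fields are illustrative assumptions, not documented parts of the API.

import anthropic

client = anthropic.Anthropic()

# Hypothetical sketch: fetch a completed session and inspect the grader's result.
# "managed_sessions.retrieve", "outcome", "score", and "rationale" are assumed
# names used for illustration; the session ID is a placeholder.
session = client.beta.managed_sessions.retrieve("msess_example_id")

if session.outcome.score < 0.8:
    # Flag low-scoring runs for human follow-up; with dreaming enabled, these
    # scores also inform the agent's scheduled memory review.
    print(f"Below threshold: {session.outcome.score} ({session.outcome.rationale})")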
Multiagent Orchestration: Parallel Specialist Agents
The multiagent orchestration feature follows the pattern that has emerged as standard in production agent systems: one lead agent for coordination, multiple specialist agents for parallel execution.
The lead agent receives a task, breaks it into subtasks, and delegates each to a specialist. Each specialist has its own model, system prompt, and tool set. They run in parallel and operate on a shared filesystem, so a specialist that generates a file makes it immediately accessible to other specialists without explicit handoff messages.
The lead agent's context accumulates contributions from specialists as they complete, giving it a running view of the overall task state.
The public beta supports up to 20 parallel specialists. Netflix is using this for log processing: hundreds of build logs processed simultaneously, where each specialist handles a subset of builds and the lead agent aggregates findings.
Accessing multiagent sessions requires the beta header on API calls:
import anthropic

client = anthropic.Anthropic()

# Beta header required for managed agents features
response = client.beta.managed_sessions.create(
    model="claude-opus-4-7",
    beta_headers=["managed-agents-2026-04-01"],
    config={
        "orchestration": {
            "max_specialists": 10,
            "specialist_model": "claude-sonnet-4-6",
        }
    },
    messages=[{"role": "user", "content": "Analyze these 50 support tickets..."}]
)
The full YAML config for multiagent sessions, dreaming schedules, and outcomes rubric format is in the Claude Platform API docs.
What This Means for Agent Developers
The three features solve different problems at different timescales:
| Feature | Problem it solves | Timescale |
|---|---|---|
| Dreaming | Agent repeats same mistakes across sessions | Cross-session learning |
| Outcomes | Agent optimizes for wrong proxy | Per-task quality |
| Multiagent | Single-threaded bottleneck on large workloads | Within-task parallelism |
Dreaming is the most novel of the three. Most agent memory systems are append-only: facts accumulate but are never synthesized. The scheduled review loop that extracts patterns from session history is closer to how a human employee learns on the job than anything existing agent memory APIs provide. The research preview status means the interface may change, but the directional investment seems firm.
Outcomes fills a gap that has existed since the start of Managed Agents. Without a rubric-based grader, defining agent success means either writing elaborate system prompts or evaluating outputs manually. Outcomes externalizes that evaluation step and makes it programmable.
Multiagent orchestration is the most immediately applicable feature for developers with existing parallelizable workflows. If your agent today processes items sequentially, this is close to a drop-in replacement, with up to 20 independent subtasks running concurrently.
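As a rough sketch of that migration, the example below replaces a sequential per-ticket loop with a single multiagent session, reusing the beta call shown earlier; the ticket strings, prompt wording, and specialist count are placeholders rather than recommendations.

import anthropic

client = anthropic.Anthropic()

# Before: loop over tickets and handle them one at a time in a single session.
# After: one session whose lead agent fans the batch out to parallel specialists
# that share a filesystem.
tickets = ["Ticket 1: ...", "Ticket 2: ...", "Ticket 3: ..."]

response = client.beta.managed_sessions.create(
    model="claude-opus-4-7",
    beta_headers=["managed-agents-2026-04-01"],
    config={
        "orchestration": {
            "max_specialists": 20,  # public beta ceiling
            "specialist_model": "claude-sonnet-4-6",
        }
    },
    messages=[{
        "role": "user",
        "content": "Triage each ticket independently and summarize findings:\n\n"
        + "\n\n".join(tickets),
    }],
)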
Current Availability
- Outcomes: Public beta, no access request required
- Multiagent orchestration: Public beta, no access request required
- Dreaming: Research preview, opt-in
- Beta header: managed-agents-2026-04-01
All three are accessible through the standard Claude Platform API at platform.claude.com.
Sources
- New in Claude Managed Agents: dreaming, outcomes, and multiagent orchestration — Official Anthropic blog, 2026-05-07
- Claude Managed Agents overview — API documentation
- Dreams API docs — Official reference
- VentureBeat coverage — Harvey case study context