Detailed Summary
Introduction to Multi-Agent Orchestration (0:00 - 1:41)
The video introduces Claude Opus 4.6 as a powerful new model but quickly shifts focus to the evolving landscape of agentic engineering, where multi-agent orchestration is becoming paramount. The presenter argues that the limiting factor is no longer the models themselves but the engineer's skill at prompt engineering and context engineering, and at building reusable agentic systems. The demonstration uses eight unique full-stack applications created by Opus 4.6 within E2B agent sandboxes to explore multi-agent orchestration and observability.
- Claude Opus 4.6 is a high-performing model, but the focus is on orchestration.
- The constraint in agentic engineering is now human capability in prompt and context engineering.
- The goal is to build reusable, powerful agentic layers.
- Eight full-stack applications, one-shotted by Opus, will serve as the demonstration playground.
- The core ideas to be explored are multi-agent orchestration and multi-agent observability.
Setting Up for Orchestration with Tmux (1:42 - 2:50)
To enable multi-agent orchestration, the video demonstrates setting up Claude Code within a Tmux session. This involves exporting the CLAUDE_CODE_EXPERIMENTAL_AGENT_TEAMS environment variable to enable team sessions. The initial task for the primary agent is to create an agent team to summarize and set up each codebase found in the agent sandbox directories.
- A new Claude Code instance is initiated, with session start and end hooks captured by the observability system.
- Claude Code is booted using Tmux to enable powerful multi-pane capabilities.
- The CLAUDE_CODE_EXPERIMENTAL_AGENT_TEAMS feature is enabled by setting an environment variable to 1.
- The primary agent is tasked with building a new agent team for each codebase and having an agent summarize its setup.
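The setup described above can be sketched in Python as a command builder. This is a minimal sketch under stated assumptions: only the variable name and the value 1 come from the summary; the `claude` executable name, the session name, and the choice to set the variable inline in the tmux shell command (rather than exporting it first) are assumptions for illustration.

```python
import shlex

# Variable name and value are taken from the summary.
AGENT_TEAMS_FLAG = "CLAUDE_CODE_EXPERIMENTAL_AGENT_TEAMS"


def build_launch_command(session_name: str) -> list:
    """Build a tmux command that starts Claude Code with agent teams enabled.

    Assumption: `claude` is the Claude Code executable on PATH.
    The variable is set inline so it reaches the process even when a
    tmux server is already running with an older environment.
    """
    shell_command = f"{AGENT_TEAMS_FLAG}=1 claude"
    return ["tmux", "new-session", "-s", session_name, shell_command]


if __name__ == "__main__":
    # "agent-teams" is a hypothetical session name.
    command = build_launch_command("agent-teams")
    print(shlex.join(command))
    # To actually launch it: subprocess.run(command, check=True)
```

The builder only constructs the command, so it can be inspected before anything is launched.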
First Agent Team Execution and Observability (2:51 - 5:59)
The primary agent creates a task list and then assigns tasks to individual sub-agents, which are spawned in new Tmux panes. These sub-agents, identified as Haiku agents, work in parallel to analyze and summarize each codebase. The multi-agent observability system tracks all events and tool calls, demonstrating the scale of parallel compute. After completing their tasks, the sub-agents shut down, and the primary agent compiles a summary of their work.
- The agent first creates a task list, a centralized hub for workflow.
- Tmux panes are opened for each sub-agent, allowing parallel execution and visualization.
- Eight Haiku agents are kicked off, each with its own context window, to summarize and understand the codebases.
- The observability system captures over 160 tool calls in under a minute, showcasing scaled compute.
- Upon completion, sub-agents finish their work, and their panes close, leaving only the primary agent.
- The primary agent then summarizes the work done by the eight agents, utilizing only 31% of its context window.
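The fan-out pattern in this section, where each sub-agent works in its own context and only a short summary returns to the primary agent, can be modeled in plain Python. This is a toy model, not Claude Code's implementation: the `SubAgent` class, the two pretend tool names, and the `app-N` codebase names are all hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass, field


@dataclass
class SubAgent:
    """Hypothetical stand-in for one sub-agent with a private context."""
    name: str
    context: list = field(default_factory=list)
    tool_calls: int = 0

    def summarize(self, codebase: str) -> str:
        # Pretend to call two tools; the results stay in this agent's context.
        for tool in ("list_files", "read_readme"):
            self.context.append(f"{tool}({codebase})")
            self.tool_calls += 1
        return f"{self.name}: summary of {codebase}"


def run_team(codebases):
    """Run one agent per codebase in parallel; return summaries and call count."""
    agents = [SubAgent(name=f"agent-{i + 1}") for i in range(len(codebases))]
    with ThreadPoolExecutor(max_workers=max(1, len(agents))) as pool:
        summaries = list(
            pool.map(lambda pair: pair[0].summarize(pair[1]), zip(agents, codebases))
        )
    # Only the summaries flow back to the caller, not each worker's context.
    return summaries, sum(agent.tool_calls for agent in agents)


if __name__ == "__main__":
    apps = [f"app-{i}" for i in range(1, 9)]  # eight hypothetical codebases
    summaries, total_calls = run_team(apps)
    print(len(summaries), total_calls)  # prints "8 16"
```

Because the primary side receives only the returned strings, its own context grows by eight short summaries rather than by every tool result, which matches the low context usage the video reports for the primary agent.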
Orchestrating Two Teams with Agent Sandboxes (6:00 - 10:46)
The demonstration progresses to spinning up new agent sandbox instances for the applications, this time using two teams of four Opus 4.6 agents. The presenter carefully crafts a prompt to trigger the TeamCreate tool and utilize a custom agent sandbox skill with a reboot command to rehost applications. The observability system continues to track the extensive parallel activity as agents run setup commands and reboot applications within their isolated environments.
- The primary agent is prompted to build two new agent teams, each with four agents, to mount the eight applications.
- Specific prompt engineering uses keywords like "Build a new agent team" and references the agent sandbox skill and backslash command.
- The agent first runs the agent sandbox skill and its backslash command to understand the reboot function.
- A task list is created, and the first team of four Opus 4.6 agents is kicked off in parallel Tmux panes.
- Each agent runs its own setup commands and reboots its assigned application within an E2B agent sandbox.
- The observability system shows extensive tool calls and the impact of each agent.
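The split of eight applications across two teams of four, with one application per agent, amounts to simple chunking. The sketch below shows that partitioning only; the application names are hypothetical placeholders, and nothing here calls Claude Code or E2B.

```python
def split_into_teams(apps, team_size):
    """Partition apps into consecutive teams of at most team_size members."""
    if team_size <= 0:
        raise ValueError("team_size must be positive")
    return [apps[i:i + team_size] for i in range(0, len(apps), team_size)]


if __name__ == "__main__":
    apps = [f"app-{i}" for i in range(1, 9)]  # hypothetical application names
    for number, team in enumerate(split_into_teams(apps, 4), start=1):
        # One agent per application, as in the demonstration.
        print(f"team {number}: {team}")
```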
Managing Parallel Teams and Addressing Issues (10:47 - 16:04)
While the first team is working, the presenter manually initiates a second team of four agents in a new Tmux session to handle the remaining applications. The video highlights the iterative nature of agentic workflows, where ad-hoc agent teams can be spun up to resolve specific issues. It also touches upon API usage and the importance of observability during complex, scaled operations.
- The presenter manually initiates a second team of four agents in a separate Tmux session to process the remaining four applications.
- The primary agent is asked to list sandbox directories not yet processed, demonstrating its ability to provide status updates.
- The presenter notes running out of API usage for the day, highlighting the cost implications of scaling compute.
- The observability system continues to provide full visibility into both teams' activities.
- The first team successfully mounts four applications, but two applications from the initial set have missing data, requiring further agent intervention.
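The status query mentioned above, listing sandbox directories that have not yet been processed, reduces to an ordered difference between two lists. A minimal sketch, assuming hypothetical directory names; in the video the primary agent answers this from its own task state rather than from code like this.

```python
def unprocessed(all_dirs, processed):
    """Return directories from all_dirs not in processed, keeping order."""
    done = set(processed)
    return [d for d in all_dirs if d not in done]


if __name__ == "__main__":
    all_dirs = [f"sandbox-{i}" for i in range(1, 9)]  # hypothetical names
    first_team = all_dirs[:4]
    print(unprocessed(all_dirs, first_team))
```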
Iterative Problem Solving and Workflow (16:05 - 20:10)
An ad-hoc agent team is created to address the missing data issues in two of the applications. The video then outlines the full multi-agent orchestration workflow: team creation, task creation, agent spawning, parallel work, agent shutdown, and team deletion. It emphasizes the importance of deleting agents after work to enforce good context engineering practices. The presenter also demonstrates managing 24 running agent sandboxes, showcasing the ability to manage compute at scale.
- A new agent team is spun up to resolve the missing data issues in two applications.
- The full multi-agent orchestration workflow is described: TeamCreate, TaskCreate, spawn agents, parallel work, shutdown, TeamDelete.
- Deleting agents after work is crucial for resetting context and maintaining efficient workflows.
- The presenter shows 24 running agent sandboxes, demonstrating extensive parallel compute management.
- Agent sandboxes are highlighted as a significant trend for scaling agent capabilities securely.
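The six-step workflow listed above can be modeled as an ordered state machine. This toy model enforces the phase order and shows agent context being discarded at team deletion. The phase names follow the summary, while the `TeamLifecycle` class and its behavior are hypothetical and do not reflect the actual tool interfaces.

```python
from enum import Enum


class Phase(Enum):
    TEAM_CREATE = "TeamCreate"
    TASK_CREATE = "TaskCreate"
    SPAWN_AGENTS = "spawn agents"
    PARALLEL_WORK = "parallel work"
    SHUTDOWN = "shutdown"
    TEAM_DELETE = "TeamDelete"


PHASES = list(Phase)  # Enum iteration preserves definition order


class TeamLifecycle:
    """Toy model: enforces phase order and drops agent context at the end."""

    def __init__(self, agent_names):
        self.agent_names = list(agent_names)
        self.contexts = {}
        self.completed = []

    def run(self, phase):
        done = len(self.completed)
        expected = PHASES[done] if done < len(PHASES) else None
        if phase is not expected:
            raise RuntimeError(f"expected {expected}, got {phase}")
        if phase is Phase.SPAWN_AGENTS:
            # Each agent starts with an empty context of its own.
            self.contexts = {name: [] for name in self.agent_names}
        elif phase is Phase.PARALLEL_WORK:
            for name, context in self.contexts.items():
                context.append(f"{name} finished its task")
        elif phase is Phase.TEAM_DELETE:
            # Deleting the team resets all accumulated context.
            self.contexts.clear()
        self.completed.append(phase)


if __name__ == "__main__":
    team = TeamLifecycle(["agent-1", "agent-2"])
    for phase in PHASES:
        team.run(phase)
    print(len(team.contexts))  # prints "0"
```

Clearing the contexts in the final phase mirrors the point made in the video: the next team starts fresh instead of inheriting stale context.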
Core Principles and New Tools (20:11 - 23:59)
The video concludes by reiterating the "core four" principles: context, model, prompt, and tools, emphasizing that human engineers remain the ultimate constraint and accelerator. It details the new Claude Code orchestration tools, categorized into team management (TeamCreate, TeamDelete), task management (TaskCreate, TaskList, TaskGet, TaskUpdate), and communications (SendMessage). The presenter encourages engineers to dig into these capabilities, offering resources like his multi-agent observability system and "Tactical Agentic Coding" course for further learning.
- All agentic engineering boils down to the "core four": context, model, prompt, tools.
- Engineers' understanding and application of tools are the primary limitations and drivers of progress.
- New Claude Code tools include TeamCreate, TeamDelete, TaskCreate, TaskList, TaskGet, TaskUpdate, and SendMessage. SendMessage is highlighted as crucial for inter-agent communication.
- The complete multi-agent orchestration workflow is reinforced as creating a team, tasks, spawning agents, parallel work, shutting down, and deleting the team.
- The presenter encourages continuous learning and provides links to his observability system, agent sandbox skill, and "Tactical Agentic Coding" course.
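The three tool categories named in this section can be captured as a small lookup table. Only the tool names and their grouping come from the summary. The sketch deliberately says nothing about each tool's parameters, and the `category_of` helper is hypothetical.

```python
# Grouping as described in the summary; tool parameters are not covered here.
ORCHESTRATION_TOOLS = {
    "team management": ("TeamCreate", "TeamDelete"),
    "task management": ("TaskCreate", "TaskList", "TaskGet", "TaskUpdate"),
    "communications": ("SendMessage",),
}


def category_of(tool_name):
    """Return the category for a tool name, or None if it is not listed."""
    for category, tools in ORCHESTRATION_TOOLS.items():
        if tool_name in tools:
            return category
    return None


if __name__ == "__main__":
    for name in ("TeamCreate", "TaskUpdate", "SendMessage"):
        print(f"{name}: {category_of(name)}")
```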