The interesting part isn’t the parallelism itself; it’s the reason for it. A single agent on a long-horizon task eventually hits its context window and starts compressing earlier context, degrading output quality. Kimi’s solution: spawn up to 100 sub-agents, each with its own fresh context, coordinating as a swarm.
The system is self-organizing: it determines how many agents to deploy and how to split the work based on the task. In benchmarks they cite 4.5x faster results and over 1,500 tool calls per task.
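The pattern behind this is a plan/fan-out/fan-in loop. Here's a minimal sketch of that general pattern, not Kimi's actual implementation: all function names are hypothetical, the planner is a stub standing in for a model call, and the sub-agents return placeholder results.

```python
import concurrent.futures

MAX_AGENTS = 100  # the cited upper bound on sub-agents

def plan(task):
    # Hypothetical planner: decide how many sub-agents to deploy and how
    # to split the work. A real system would ask the model itself; here
    # the task already carries an explicit list of subtasks.
    subtasks = task["subtasks"]
    return subtasks[:MAX_AGENTS]

def run_subagent(subtask):
    # Each sub-agent starts with a FRESH context: only its own subtask,
    # never the parent's full history, so it can't exhaust the parent's
    # context window. The model call is stubbed out.
    context = [{"role": "user", "content": subtask}]
    return f"result for: {context[-1]['content']}"

def orchestrate(task):
    subtasks = plan(task)
    # Fan out: sub-agents run in parallel, isolated from one another.
    with concurrent.futures.ThreadPoolExecutor(max_workers=len(subtasks)) as ex:
        results = list(ex.map(run_subagent, subtasks))
    # Fan in: the orchestrator merges sub-agent results into one answer.
    return "\n".join(results)

task = {"subtasks": ["survey part A", "survey part B", "survey part C"]}
print(orchestrate(task))
```

The key design point is that the context-window pressure moves from one long transcript to the merge step: each worker stays small, and only the orchestrator has to reason over the combined results.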
The framing they use is a shift from “bigger models” (vertical scaling) to “organizational intelligence” (horizontal scaling): one brain vs. a company or a laboratory.
Haven’t tried it yet, but the architecture is a clean answer to a real constraint.