nn.Agent
The Agent is a Module that uses language models to solve tasks. It can handle multimodal data, interact with environments through tool calls, and manage complex workflows with structured outputs.
✦₊⁺ Overview
An Agent combines a language model with instructions and tools to accomplish tasks. The Agent module adopts a task decomposition strategy, allowing each part of a task to be treated in isolation.
Key Features
- Multimodal Support: Handle text, images, audio, video, and files
- Tool Calling: Execute functions to interact with external systems
- Generation Schemas: Guides the model to generate typed responses, with support for reasoning strategies: Chain of Thought, ReAct, Self-Consistency
- Flexible Configuration: Customize behavior through message fields and config options
- Template System: Use Jinja templates for prompts and responses
- Modular System Prompt: Compose system prompts from independent components
- Task Decomposition: Break down complex tasks into manageable parts
1. Contents
| Topic | Description |
|---|---|
| Quick Start | Get started with a minimal agent, class-based definition (AutoParams) |
| Async | Asynchronous agent execution |
| Streaming | Real-time response streaming |
| Reasoning | Model-level reasoning, reasoning_in_response, dual-queue streaming |
| How to Debug an Agent | Inspection and debugging tools |
| System Prompt Components | Compose system prompts from components |
| Generation Schemas | Structured outputs and reasoning strategies |
| Task and Context | Input handling, templates, multimodal, chat history |
| Vars | Unified execution variable space |
| Tools | Tool calling, configuration, MCP, agent-as-tool |
| Signatures | Declarative input/output specifications |
| Hooks & Guards | Input and output safety checks with configurable policy |
| Model Gateway | Multi-model routing |
| Prefilling | Guide response format with prefilling |