Some providers expose a separate “thinking” channel that streams the model’s reasoning metadata alongside the normal message output. This lets you display or inspect the model’s inner monologue without sending it back to the model or showing it to end users who just want the final answer. OpenAI Responses, Anthropic, and Google all support extended thinking. The generated output is provider-agnostic, so additional providers can adopt it later without requiring any changes to your code.
Enable Thinking
Enable thinking at the Agent level using the `enableThinking` parameter.
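A minimal sketch of enabling thinking and reading it back. `Agent`, `enableThinking`, and `result.thinking` come from this page; the model id and the `send` method name are assumptions about the API:

```dart
import 'package:dartantic_ai/dartantic_ai.dart';

Future<void> main() async {
  // enableThinking is the documented switch; 'anthropic' is an example model id.
  final agent = Agent('anthropic', enableThinking: true);

  // The send method name is an assumption about the Agent API.
  final result = await agent.send('Why is the sky blue?');

  print(result.output);   // the final answer for end users
  print(result.thinking); // the reasoning metadata, kept out of the answer
}
```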
Provider-Specific Defaults
Each provider has sensible defaults when `enableThinking: true`:
- OpenAI Responses: Uses `reasoningSummary: detailed` automatically
- Anthropic: Uses a 4096-token budget for extended thinking
- Google: Uses a dynamic token budget (the model decides based on task complexity)
Advanced Configuration
For fine-tuning provider-specific behavior, use the provider options classes.
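As a sketch of what per-provider configuration might look like: the options class names and parameter names below are illustrative assumptions, not verified API; only the documented defaults (`reasoningSummary: detailed`, the 4096-token Anthropic budget) come from this page.

```dart
// All class and parameter names below are hypothetical placeholders.
final openaiAgent = Agent(
  'openai-responses',
  enableThinking: true,
  chatModelOptions: OpenAIResponsesChatOptions(
    reasoningSummary: 'detailed', // matches the documented default
  ),
);

final anthropicAgent = Agent(
  'anthropic',
  enableThinking: true,
  chatModelOptions: AnthropicChatOptions(
    thinkingBudgetTokens: 8192, // raise the default 4096-token budget
  ),
);
```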
Key Points
- Access thinking via `result.thinking` for both streaming and non-streaming calls
- Thinking is also stored as `ThinkingPart` in consolidated messages for history
- You control where (or if) thinking is displayed
- Important: when using Anthropic with tool calls, thinking blocks are automatically preserved in conversation history, as required by their API. This increases token costs on subsequent turns.
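The history point above can be sketched as follows. `ThinkingPart` is the documented type; the `messages`/`parts` accessors and the `.text` field are assumptions:

```dart
// Walk consolidated message history and pull out ThinkingPart entries.
// The messages/parts accessors are assumed names, not verified API.
for (final message in result.messages) {
  for (final part in message.parts) {
    if (part is ThinkingPart) {
      print('model thought: ${part.text}'); // .text is an assumed field
    }
  }
}
```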
Streaming Thinking
When streaming, thinking arrives incrementally via `chunk.thinking`, so you can render a live “thought bubble” while the model is working. Each chunk may include text output (`chunk.output`), thinking (`chunk.thinking`), or both.
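A streaming sketch, assuming a `sendStream` method on `Agent` (the chunk fields `output` and `thinking` are documented above; the method name is an assumption):

```dart
import 'dart:io';
import 'package:dartantic_ai/dartantic_ai.dart';

Future<void> main() async {
  final agent = Agent('anthropic', enableThinking: true);

  // sendStream is an assumed name for the streaming entry point.
  await for (final chunk in agent.sendStream('Plan a weekend trip')) {
    // Each chunk may carry thinking, output, or both.
    if (chunk.thinking.isNotEmpty) stdout.write('[thinking] ${chunk.thinking}');
    if (chunk.output.isNotEmpty) stdout.write(chunk.output);
  }
}
```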
Related Topics
- Streaming Output – Combine thinking with live text
- Server-Side Tools – Providers can expose both thinking metadata and intrinsic tools

