generateMedia() API that works across multiple
providers. Each provider uses its own underlying mechanism (native image
generation, code execution, etc.) but the API surface is consistent.
Provider Support
| Provider | Images | PDFs | CSV/Files | Mechanism |
|---|---|---|---|---|
| OpenAI Responses | Native + Code | Code | Code | DALL-E + Code Interpreter |
| Google | Native + Code | Code | Code | Nano Banana Pro (gemini-3-pro-image-preview) + Code Execution |
| Anthropic | Code | Code | Code | Code Interpreter (matplotlib, reportlab) |
Basic Usage
Example MIME Types
The `mimeTypes` parameter tells the provider what kind of output you want, for example:
- `image/png`, `image/jpeg` - Images
- `application/pdf` - PDF documents
- `text/csv` - CSV files
- Other file types depend on provider code execution capabilities
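
As a rough illustration of how the MIME types listed here line up with the Images, PDFs, and CSV/Files columns of the provider table, the sketch below sorts a requested list into those three groups. It is plain Python and independent of any provider SDK; `media_category` and `group_by_category` are hypothetical helper names, not part of the API described on this page.

```python
from __future__ import annotations

from collections import defaultdict


def media_category(mime_type: str) -> str:
    """Map a MIME type to one of the output columns in the provider table."""
    normalized = mime_type.strip().lower()
    if normalized.startswith("image/"):
        return "Images"
    if normalized == "application/pdf":
        return "PDFs"
    # Everything else (text/csv and so on) relies on provider code execution.
    return "CSV/Files"


def group_by_category(mime_types: list[str]) -> dict[str, list[str]]:
    """Group requested MIME types by category, preserving request order."""
    groups: dict[str, list[str]] = defaultdict(list)
    for mime_type in mime_types:
        groups[media_category(mime_type)].append(mime_type)
    return dict(groups)


requested = ["image/png", "image/jpeg", "application/pdf", "text/csv"]
print(group_by_category(requested))
```

A grouping like this can help decide up front whether a request can use native image generation or has to go through code execution, according to the provider table.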
Streaming Generation
Use `generateMediaStream()` to receive incremental updates.
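
The consumption pattern for incremental updates can be sketched without the real API. The Python example below is a simulation only: `MediaChunk`, `fake_media_stream`, and `collect_assets` are hypothetical names, the stream is a synchronous generator rather than a real asynchronous stream, and it assumes each update carries only newly produced assets rather than a cumulative list.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Iterable, Iterator


@dataclass
class MediaChunk:
    """Hypothetical stand-in for one incremental streaming update."""

    assets: list[bytes] = field(default_factory=list)
    is_complete: bool = False


def fake_media_stream() -> Iterator[MediaChunk]:
    """Simulated stream: a progress-only update, a partial one, then the final one."""
    yield MediaChunk()  # progress only, no assets yet
    yield MediaChunk(assets=[b"first-asset"])
    yield MediaChunk(assets=[b"second-asset"], is_complete=True)


def collect_assets(stream: Iterable[MediaChunk]) -> list[bytes]:
    """Accumulate assets from each update until one reports completion."""
    collected: list[bytes] = []
    for chunk in stream:
        collected.extend(chunk.assets)
        if chunk.is_complete:
            break
    return collected


assets = collect_assets(fake_media_stream())
print(len(assets))
```

If the real stream turns out to send cumulative results instead, the loop would replace `collected` on each update rather than extend it.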
Specifying Media Models
Use the model string to specify which media model to use.
Provider-Specific Options
Each provider accepts options for customization.
OpenAI Responses
Anthropic
Result Structure
`MediaGenerationResult` contains:
- `assets` - List of `Part` objects (typically `DataPart` with bytes)
- `links` - List of `LinkPart` for hosted URLs (if applicable)
- `messages` - Conversation messages generated during the run
- `metadata` - Provider-specific metadata (previews, progress, etc.)
- `isComplete` - Whether generation is finished (for streaming)
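
To make the shape of the result concrete, here is a minimal Python model of the fields listed above, plus a helper that writes byte assets to disk. The classes are hypothetical stand-ins that reuse the documented names in Python style (`is_complete` for `isComplete`). The field types, the `mime_type` attribute on `DataPart`, the extension table, and `save_assets` are all assumptions made for illustration, not the library's actual definitions.

```python
from __future__ import annotations

import tempfile
from dataclasses import dataclass, field
from pathlib import Path


@dataclass
class DataPart:
    """Hypothetical stand-in for an asset carrying raw bytes."""

    data: bytes
    mime_type: str


@dataclass
class LinkPart:
    """Hypothetical stand-in for a hosted URL."""

    url: str


@dataclass
class MediaGenerationResult:
    assets: list[DataPart] = field(default_factory=list)
    links: list[LinkPart] = field(default_factory=list)
    messages: list[str] = field(default_factory=list)  # simplified to strings
    metadata: dict[str, object] = field(default_factory=dict)
    is_complete: bool = True


# Assumed mapping from MIME type to file extension for saved assets.
EXTENSIONS = {
    "image/png": ".png",
    "image/jpeg": ".jpg",
    "application/pdf": ".pdf",
    "text/csv": ".csv",
}


def save_assets(result: MediaGenerationResult, out_dir: Path) -> list[Path]:
    """Write each byte asset to out_dir and return the created paths."""
    out_dir.mkdir(parents=True, exist_ok=True)
    paths: list[Path] = []
    for index, asset in enumerate(result.assets):
        extension = EXTENSIONS.get(asset.mime_type, ".bin")
        path = out_dir / f"asset_{index}{extension}"
        path.write_bytes(asset.data)
        paths.append(path)
    return paths


result = MediaGenerationResult(
    assets=[DataPart(b"a,b\n1,2\n", "text/csv")],
    links=[LinkPart("https://example.com/report.pdf")],
)
saved = save_assets(result, Path(tempfile.mkdtemp()))
print([path.name for path in saved])
```

Hosted links are left untouched in this sketch; only byte assets are written, since downloading a URL is a separate concern.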
Examples
Related Topics
- Server-Side Tools - Lower-level access to code interpreters
- Multimedia Input - Send images and files to models

