File Search
Search through uploaded documents and files with OpenAI's File Search tool
The File Search tool allows models to search through documents and files that have been uploaded to OpenAI. This enables semantic search across your knowledge base, documentation, or any text-based content you've stored with OpenAI.
Prerequisites
Before using File Search, you need to:
- Upload files to OpenAI using the Files API
- Create a vector store and add your files to it
- Note the file IDs or vector store ID for configuration
Using the OpenAI CLI
# Upload a single file
openai api files.create \
--file documentation.pdf \
--purpose assistants
# Response includes file ID:
# {
# "id": "file-abc123...",
# "purpose": "assistants",
# ...
# }
Using the OpenAI API
curl https://api.openai.com/v1/files \
-H "Authorization: Bearer $OPENAI_API_KEY" \
-F purpose="assistants" \
-F file="@documentation.pdf"
Creating a Vector Store
# Create a vector store
curl https://api.openai.com/v1/vector_stores \
-H "Authorization: Bearer $OPENAI_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"name": "My Documentation",
"file_ids": ["file-abc123...", "file-xyz789..."]
}'
# Response includes vector store ID:
# {
# "id": "vs_abc123...",
# ...
# }
Basic Usage
import 'package:dartantic_ai/dartantic_ai.dart';
final agent = Agent(
'openai-responses:gpt-4o',
chatModelOptions: const OpenAIResponsesChatOptions(
serverSideTools: {OpenAIServerSideTool.fileSearch},
),
);
final response = await agent.send(
'Search for information about error handling best practices'
);
Configuration Options
Customize file search behavior with FileSearchConfig:
final agent = Agent(
'openai-responses:gpt-4o',
chatModelOptions: OpenAIResponsesChatOptions(
serverSideTools: {OpenAIServerSideTool.fileSearch},
fileSearchConfig: FileSearchConfig(
maxResults: 10, // Maximum number of search results
metadataFilters: {
'category': 'technical',
'version': '2.0',
},
),
),
);
Configuration Parameters
- maxResults: Limit the number of search results to return (default: 20)
- metadataFilters: Filter results by metadata fields you've added to your files
Monitoring Search Activity
While a response streams, File Search reports its progress through the file_search entry of each chunk's metadata:
await for (final chunk in agent.sendStream(prompt)) {
// Display the response
if (chunk.output.isNotEmpty) print(chunk.output);
// Monitor file search activity
final fileSearch = chunk.metadata['file_search'];
if (fileSearch != null) {
final stage = fileSearch['stage'];
print('File search: $stage');
if (fileSearch['data'] != null) {
final data = fileSearch['data'];
// Show search query
if (data['query'] != null) {
print('Searching for: ${data['query']}');
}
// Show results count
if (data['results'] != null && data['results'] is List) {
final results = data['results'] as List;
print('Found ${results.length} relevant sections');
// Preview first result
if (results.isNotEmpty) {
final first = results[0];
if (first['content'] != null) {
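final preview = first['content'].toString();
// Clamp the preview so matches shorter than 100 characters do not throw
final end = preview.length < 100 ? preview.length : 100;
print('First match: ${preview.substring(0, end)}...');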
}
}
}
}
}
}
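The nested checks in the loop above can also be pulled into a small helper that is easy to test in isolation. The following is a minimal sketch, assuming the file_search metadata has the shape used above (a stage value plus an optional data map with query and results). The name describeFileSearch is hypothetical and not part of dartantic_ai.

```dart
/// Hypothetical helper (not part of dartantic_ai): turns one `file_search`
/// metadata map into log lines. Assumes the shape used in the loop above.
List<String> describeFileSearch(Map<String, dynamic> fileSearch) {
  final lines = <String>['File search: ${fileSearch['stage']}'];
  final data = fileSearch['data'];
  if (data is! Map) return lines;

  final query = data['query'];
  if (query != null) lines.add('Searching for: $query');

  final results = data['results'];
  if (results is List) {
    lines.add('Found ${results.length} relevant sections');
    final first = results.isEmpty ? null : results.first;
    if (first is Map && first['content'] != null) {
      final text = first['content'].toString();
      // Clamp the preview so short matches do not throw a RangeError.
      final preview =
          text.length <= 100 ? text : '${text.substring(0, 100)}...';
      lines.add('First match: $preview');
    }
  }
  return lines;
}

void main() {
  // Sample event built by hand for illustration; real events come from
  // chunk.metadata['file_search'].
  final event = {
    'stage': 'completed',
    'data': {
      'query': 'error handling',
      'results': [
        {'content': 'Wrap calls in try/catch.'},
      ],
    },
  };
  describeFileSearch(event).forEach(print);
}
```

Returning the lines instead of printing them lets the caller decide whether they go to the console, a logger, or a UI.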
Documentation Search
// After uploading your API documentation
final response = await agent.send(
'How do I authenticate API requests in our system?'
);
Knowledge Base Queries
// After uploading company policies
final response = await agent.send(
'What is our remote work policy regarding time zones?'
);
Technical Reference
// After uploading technical specifications
final response = await agent.send(
'What are the performance requirements for the database module?'
);
Legal Document Search
// After uploading contracts and agreements
final response = await agent.send(
'Find all clauses related to intellectual property rights'
);
Research Papers
// After uploading academic papers
final response = await agent.send(
'Summarize the findings about machine learning in healthcare'
);
File Types Supported
File Search works with:
- PDF documents
- Text files (.txt, .md)
- Word documents (.docx)
- HTML files
- JSON/CSV data files
- Code files (various programming languages)
Best Practices
- Organize Files: Use meaningful file names and metadata
  # When uploading files, add metadata
  curl https://api.openai.com/v1/files \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -F file="@doc.pdf" \
  -F purpose="assistants" \
  -F metadata='{"category": "api", "version": "2.0"}'
- Use Metadata Filters: Narrow searches to relevant documents
  fileSearchConfig: FileSearchConfig(
  metadataFilters: {'department': 'engineering'},
  )
- Chunk Large Documents: Break very large documents into sections for better search
- Regular Updates: Keep your vector store updated with the latest documents
- Clear Queries: Be specific about what you're looking for
  // Good
  'Find the deployment process for production environments'
  // Less effective
  'deployment info'
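For the "Chunk Large Documents" practice, one possible approach is to split a long text document at paragraph boundaries before uploading the pieces as separate files. The following is a minimal sketch under that assumption. The helper name splitIntoSections and the 4000-character default are illustrative choices, not recommendations from OpenAI or dartantic_ai.

```dart
/// Hypothetical helper: splits [text] into sections of at most [maxChars]
/// characters, breaking only at blank lines (paragraph boundaries).
/// A single paragraph longer than [maxChars] is kept whole.
List<String> splitIntoSections(String text, {int maxChars = 4000}) {
  final paragraphs = text
      .split(RegExp(r'\n\s*\n'))
      .map((p) => p.trim())
      .where((p) => p.isNotEmpty);

  final sections = <String>[];
  final current = StringBuffer();
  for (final paragraph in paragraphs) {
    // +2 accounts for the blank line that joins two paragraphs.
    final wouldOverflow = current.isNotEmpty &&
        current.length + 2 + paragraph.length > maxChars;
    if (wouldOverflow) {
      sections.add(current.toString());
      current.clear();
    }
    if (current.isNotEmpty) current.write('\n\n');
    current.write(paragraph);
  }
  if (current.isNotEmpty) sections.add(current.toString());
  return sections;
}

void main() {
  const doc = 'Intro paragraph.\n\nSetup steps.\n\nDeployment notes.';
  final sections = splitIntoSections(doc, maxChars: 30);
  for (var i = 0; i < sections.length; i++) {
    print('section ${i + 1}: ${sections[i].length} chars');
  }
}
```

Breaking at blank lines keeps each paragraph intact, which tends to matter more for search quality than hitting an exact size.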
List Uploaded Files
curl https://api.openai.com/v1/files \
-H "Authorization: Bearer $OPENAI_API_KEY"
Delete Files
curl -X DELETE https://api.openai.com/v1/files/file-abc123 \
-H "Authorization: Bearer $OPENAI_API_KEY"
Update Vector Store
curl -X POST https://api.openai.com/v1/vector_stores/vs_abc123/files \
-H "Authorization: Bearer $OPENAI_API_KEY" \
-H "Content-Type: application/json" \
-d '{"file_id": "file-new123"}'
Limitations
- File Size: Individual files limited to 512 MB
- Total Storage: Account limits apply
- File Types: Only text-extractable formats supported
- Processing Time: Large files may take time to index
- Search Scope: Only searches within uploaded files
- Cost: Storage and retrieval incur charges
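Some of these limits can be checked locally before an upload is attempted. The following is a minimal sketch that uses only the size limit and file types listed on this page. The helper uploadProblem is hypothetical, the extension set is deliberately incomplete (code files are omitted), and 512 MB is assumed to mean 512 × 1024 × 1024 bytes.

```dart
import 'dart:io';

// Limits taken from this page; adjust them if OpenAI's documented limits
// differ. Assumes 512 MB means 512 * 1024 * 1024 bytes.
const maxFileBytes = 512 * 1024 * 1024;

// Illustrative subset of the supported types; code files are not listed.
const supportedExtensions = {
  '.pdf', '.txt', '.md', '.docx', '.html', '.json', '.csv',
};

/// Hypothetical helper: returns a description of the problem, or null if
/// the file name and size pass both checks.
String? uploadProblem(String fileName, int sizeBytes) {
  final dot = fileName.lastIndexOf('.');
  final extension = dot < 0 ? '' : fileName.substring(dot).toLowerCase();
  if (!supportedExtensions.contains(extension)) {
    return 'Unsupported file type: "$extension"';
  }
  if (sizeBytes > maxFileBytes) {
    return 'File is $sizeBytes bytes; the limit is $maxFileBytes';
  }
  return null;
}

void main(List<String> args) {
  // Pass one or more file paths on the command line.
  for (final path in args) {
    final file = File(path);
    if (!file.existsSync()) {
      print('$path: not found');
      continue;
    }
    final verdict = uploadProblem(path, file.lengthSync()) ?? 'ok';
    print('$path: $verdict');
  }
}
```

A local check like this only catches the obvious cases. Account storage limits and text extraction failures still surface on the server side.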
Error Handling
Common issues and solutions:
try {
final response = await agent.send('Search for...');
} catch (e) {
// Possible errors:
// - No files uploaded
// - Vector store not configured
// - Files still being processed
// - Exceeded storage limits
print('File search error: $e');
}
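Of the errors listed above, "files still being processed" is usually temporary, so retrying after a short wait may be enough. The following is a minimal sketch of a generic retry wrapper in plain Dart. retryWithBackoff is a hypothetical helper, not a dartantic_ai API, and it retries on any error, which is an assumption a real application would likely narrow.

```dart
/// Hypothetical helper: runs [action], retrying up to [maxAttempts] times
/// with a doubling delay between attempts. Rethrows the last error.
Future<T> retryWithBackoff<T>(
  Future<T> Function() action, {
  int maxAttempts = 3,
  Duration initialDelay = const Duration(seconds: 2),
}) async {
  var delay = initialDelay;
  var attempt = 0;
  while (true) {
    attempt++;
    try {
      return await action();
    } catch (e) {
      // Give up once the attempt budget is spent.
      if (attempt >= maxAttempts) rethrow;
      print('Attempt $attempt failed ($e); '
          'retrying in ${delay.inMilliseconds} ms');
      await Future<void>.delayed(delay);
      delay *= 2;
    }
  }
}

Future<void> main() async {
  // Simulated operation that fails twice before succeeding.
  var calls = 0;
  final result = await retryWithBackoff(
    () async {
      calls++;
      if (calls < 3) throw StateError('still indexing');
      return 'done after $calls calls';
    },
    initialDelay: const Duration(milliseconds: 10),
  );
  print(result);
}
```

With an agent in scope, the same wrapper could be applied as `await retryWithBackoff(() => agent.send('Search for...'))`.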
No Results Found
If searches return no results:
final response = await agent.send('Search for specific term');
// Heuristic: look for "no results" wording in the reply (case-insensitive)
final output = response.output.toLowerCase();
if (output.contains('no results') || output.contains("couldn't find")) {
print('No matches found. Ensure files are uploaded and indexed.');
}
Cost Considerations
File Search incurs costs for:
- File storage (per GB per month)
- Vector store operations
- Retrieval operations during search
Check OpenAI pricing for current rates.
Files Not Being Searched
- Verify files are uploaded with purpose="assistants"
- Ensure files are added to a vector store
- Check file processing status
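One way to check processing status is to list the files in a vector store and count them by their status field. The following is a minimal sketch, assuming a GET request to the same /v1/vector_stores/{id}/files path used under "Update Vector Store" returns a data list whose entries carry a status such as in_progress, completed, or failed. countByStatus is a hypothetical helper, and the script reads OPENAI_API_KEY from the environment.

```dart
import 'dart:convert';
import 'dart:io';

/// Hypothetical helper: counts files per `status` value in a decoded
/// "list vector store files" response. Entries without a string status
/// are counted as 'unknown'.
Map<String, int> countByStatus(Map<String, dynamic> listResponse) {
  final counts = <String, int>{};
  final data = listResponse['data'];
  if (data is! List) return counts;
  for (final entry in data) {
    final raw = entry is Map ? entry['status'] : null;
    final status = raw is String ? raw : 'unknown';
    counts[status] = (counts[status] ?? 0) + 1;
  }
  return counts;
}

Future<void> main(List<String> args) async {
  final apiKey = Platform.environment['OPENAI_API_KEY'];
  if (apiKey == null || args.isEmpty) {
    stderr.writeln('Usage: dart run status.dart <vector_store_id> '
        '(with OPENAI_API_KEY set)');
    exitCode = 64;
    return;
  }
  final client = HttpClient();
  try {
    final request = await client.getUrl(
      Uri.parse('https://api.openai.com/v1/vector_stores/${args.first}/files'),
    );
    request.headers.set(HttpHeaders.authorizationHeader, 'Bearer $apiKey');
    final response = await request.close();
    final body = await response.transform(utf8.decoder).join();
    if (response.statusCode != 200) {
      stderr.writeln('Request failed (${response.statusCode}): $body');
      exitCode = 1;
      return;
    }
    print(countByStatus(jsonDecode(body) as Map<String, dynamic>));
  } finally {
    client.close();
  }
}
```

If any files are reported as still in progress, searches may miss their content until indexing finishes.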
Poor Search Results
- Review file content quality
- Use more specific search queries
- Check if files contain searchable text (not images)