# Embeddings

> Some providers produce embeddings as well as chat responses.

An AI embedding is a numerical vector representation of data (like text, images,
or audio) that captures its meaning and context, allowing AI models to
understand and work with complex, unstructured information. By converting
diverse data types into numbers, embeddings place similar items close together
in a high-dimensional space, making it easier for algorithms to find patterns,
perform searches, and group information based on semantic relationships rather
than just keyword matches.

Dartantic has full support for creating embeddings using providers that expose
embeddings models.

## Basic Usage

You can turn a string into an embedding using one of the `embedXxx` methods. The
`embedQuery` method works on a single string and is generally meant for lookups
(which we'll see later). The `embedDocuments` method works on a batch of
strings.

```dart theme={null}
final agent = Agent('openai');

// Single text
final result = await agent.embedQuery('Hello world');
print(result.embeddings.length); // 1536

// Multiple texts
final responsesAgent = Agent('openai-responses');
final results = await responsesAgent.embedDocuments([
  'Machine learning',
  'Deep learning',
  'Neural networks'
]);
```

Each embeddings model limits the length of its input (typically measured in
tokens), so you'll have to split larger strings according to the requirements
of the model you're using.
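
One way to approach that splitting is sketched below in plain Dart. It is only a rough illustration: `splitIntoChunks` and its `maxChars` parameter are hypothetical names, not part of dartantic, and character count is a crude stand-in for the token limits that models actually enforce.

```dart
/// Splits [text] into chunks of at most [maxChars] characters, breaking on
/// whitespace where possible. Hypothetical helper; character count is only a
/// rough proxy for a model's token limit.
List<String> splitIntoChunks(String text, {int maxChars = 1000}) {
  if (maxChars <= 0) {
    throw ArgumentError.value(maxChars, 'maxChars', 'must be positive');
  }
  final chunks = <String>[];
  final buffer = StringBuffer();

  for (final word in text.split(RegExp(r'\s+'))) {
    if (word.isEmpty) continue;

    // Flush the current chunk if adding this word would overflow it.
    if (buffer.isNotEmpty && buffer.length + 1 + word.length > maxChars) {
      chunks.add(buffer.toString());
      buffer.clear();
    }

    // A single word longer than maxChars is hard-split.
    var rest = word;
    while (rest.length > maxChars) {
      chunks.add(rest.substring(0, maxChars));
      rest = rest.substring(maxChars);
    }

    if (buffer.isNotEmpty) buffer.write(' ');
    buffer.write(rest);
  }

  if (buffer.isNotEmpty) chunks.add(buffer.toString());
  return chunks;
}

void main() {
  final chunks = splitIntoChunks(
    'Embeddings map text to vectors so similar text lands close together.',
    maxChars: 30,
  );
  chunks.forEach(print);
}
```

The resulting chunks could then be passed to `embedDocuments` as a batch. A production splitter would more likely count tokens and respect sentence or paragraph boundaries.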

## Similarity

There are several ways to search with embeddings, and many of the advanced ones
are beyond the scope of this documentation, but one simple, built-in way is
cosine similarity. Cosine similarity is the cosine of the angle between two
non-zero vectors: a score of 1 means they point in the same direction, 0 means
they are orthogonal (unrelated), and -1 means they point in opposite
directions.

For embeddings, this makes cosine similarity useful for searches: the higher
the score, the closer the match.

```dart theme={null}
// Compare two texts
final embed1 = await agent.embedQuery('cat');
final embed2 = await agent.embedQuery('dog');

final similarity = EmbeddingsModel.cosineSimilarity(
  embed1.embeddings,
  embed2.embeddings,
);
print(similarity); // 0.8234
```
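
For intuition, the calculation itself can be sketched in a few lines of plain Dart. This is an illustrative version under the standard definition (dot product divided by the product of the vector lengths), not the library's implementation; the function name `cosine` is made up for this example.

```dart
import 'dart:math' show sqrt;

/// Cosine similarity of [a] and [b]: dot(a, b) / (|a| * |b|).
/// Illustrative sketch only, not dartantic's implementation.
double cosine(List<double> a, List<double> b) {
  if (a.length != b.length) {
    throw ArgumentError('Vectors must have the same length');
  }
  var dot = 0.0;
  var normA = 0.0;
  var normB = 0.0;
  for (var i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  // Cosine similarity is undefined for zero vectors; return 0 by convention.
  if (normA == 0 || normB == 0) return 0.0;
  return dot / (sqrt(normA) * sqrt(normB));
}

void main() {
  print(cosine([1.0, 0.0], [2.0, 0.0])); // 1.0 (same direction)
  print(cosine([1.0, 0.0], [0.0, 3.0])); // 0.0 (orthogonal)
  print(cosine([1.0, 0.0], [-1.0, 0.0])); // -1.0 (opposite direction)
}
```

Note that the result depends only on direction, not magnitude: scaling either vector leaves the score unchanged.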

## Search Example

A typical pattern is to use `embedDocuments` to calculate embeddings for the
documents to be searched, then use `embedQuery` to embed the search text and
compare it against them:

```dart theme={null}
import 'dart:math' show max;

// Embed the query and the documents to search
final query = await agent.embedQuery('programming');
final docs = await agent.embedDocuments([
  'Dart language',
  'Cooking recipes',
  'Python coding',
]);

// Score each document against the query
final sims = docs.embeddings
    .map((e) => EmbeddingsModel.cosineSimilarity(query.embeddings, e))
    .toList();

// Find the index of the best match (max comes from dart:math)
final best = sims.indexOf(sims.reduce(max));
print('Best match: index $best');
```

Some embedding models use the same underlying implementation for query and
document embeddings, but following this pattern will get you to the right
place either way.
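
Searches often want more than the single best match. As a minimal sketch, assuming you already have a list of similarity scores like `sims` above, the hypothetical helper `topK` below (not part of dartantic) ranks document indices by score:

```dart
/// Returns the indices of the [k] highest values in [scores], best first.
/// Hypothetical helper for illustration.
List<int> topK(List<double> scores, int k) {
  final indices = List<int>.generate(scores.length, (i) => i);
  // Sort indices by descending score.
  indices.sort((a, b) => scores[b].compareTo(scores[a]));
  return indices.take(k).toList();
}

void main() {
  // Made-up similarity scores for three documents.
  final sims = [0.41, 0.12, 0.38];
  print(topK(sims, 2)); // [0, 2]
}
```

Sorting every score is fine for small document sets; larger collections are usually better served by a vector index or database.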

## Configuration

Embedding models from different providers expose different options. For
example, it's common to be able to change the default size of the calculated
embeddings vector.

```dart theme={null}
// Custom model
Agent('openai?embeddings=text-embedding-3-large');

// Reduce dimensions (OpenAI / OpenAI Responses)
final agent = Agent(
  'openai-responses',
  embeddingsModelOptions: OpenAIEmbeddingsModelOptions(
    dimensions: 256, // Smaller vectors
  ),
);

// Reduce dimensions (Mistral)
// Note: Custom dimensions require a compatible model like codestral-embed-2505
// The default mistral-embed model does not support custom dimensions
final mistralAgent = Agent(
  'mistral?embeddings=codestral-embed-2505',
  embeddingsModelOptions: MistralEmbeddingsModelOptions(
    dimensions: 256, // Smaller vectors
  ),
);
```

A larger vector generally gives more accurate comparison results but costs more
to calculate, store and compare. Not all embedding models support custom
dimensions, so check your provider's documentation for model-specific
capabilities.

## Vectors are Provider-Specific

Even though the concept of embeddings and how to use them is the same across
providers, and many providers calculate embeddings that are the same size, an
embedding calculated with one provider, e.g. Google, is NOT compatible with an
embedding calculated with another provider, e.g. OpenAI. You'll need to track
the source of each embedding and maintain integrity yourself per the
requirements of your application.
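
One possible way to track the source is to store an identifier alongside each vector and check it before comparing. The sketch below is an assumption about how an application might do this: the `TaggedEmbedding` class, its fields, and the identifier format are all hypothetical and not part of dartantic.

```dart
/// A stored embedding tagged with the provider and model that produced it.
/// Hypothetical application-side type, not part of dartantic.
class TaggedEmbedding {
  const TaggedEmbedding({required this.source, required this.vector});

  /// Identifies the provider and model, e.g. 'openai:text-embedding-3-small'.
  final String source;

  /// The embedding values returned by that model.
  final List<double> vector;

  /// True when comparing this embedding with [other] is meaningful.
  bool isComparableTo(TaggedEmbedding other) =>
      source == other.source && vector.length == other.vector.length;
}

void main() {
  final a = TaggedEmbedding(
    source: 'openai:text-embedding-3-small',
    vector: [0.1, 0.2],
  );
  // 'example-model' is a placeholder, not a real model name.
  final b = TaggedEmbedding(
    source: 'google:example-model',
    vector: [0.3, 0.4],
  );
  print(a.isComparableTo(b)); // false
}
```

Including the model name in the identifier, not just the provider, also guards against mixing vectors from two different models offered by the same provider.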

## Examples

* [Embeddings](https://github.com/csells/dartantic_ai/blob/main/packages/dartantic_ai/example/bin/embeddings.dart)

## Next Steps

* [Providers](/providers) - Embeddings support by provider
