embedder#
The component that orchestrates the model client (embedding models in particular) and the output processors.
Classes
| BatchEmbedder | Adds batching to the embedder component. |
| Embedder | A user-facing component that orchestrates an embedder model via the model client and output processors. |
- class Embedder(*, model_client: ModelClient, model_kwargs: Dict[str, Any] = {}, output_processors: Component | None = None)[source]#
Bases:
Component
A user-facing component that orchestrates an embedder model via the model client and output processors.
- Parameters:
model_client (ModelClient) – The model client to use for the embedder.
model_kwargs (Dict[str, Any], optional) – The model kwargs to pass to the model client. Defaults to {}.
output_processors (Optional[Component], optional) – The output processors to apply after the model call. Defaults to None. Any further processing you add should operate on the EmbedderOutput data type.
input: a single str or a list of str. When a list is used, the list is processed as a batch of inputs in the model client.
Note
The output_processors will be applied only to the data field of EmbedderOutput, which is a list of Embedding.
Use BatchEmbedder to automatically batch large inputs of more than 100 items.
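As a rough illustration of the note above, the sketch below shows a processor that touches only the data field. The Embedding and EmbedderOutput dataclasses here are simplified stand-ins written for this example (their fields are assumptions, not the library's definitions), and l2_normalize and apply_output_processor are hypothetical helpers.

```python
import math
from dataclasses import dataclass, field
from typing import List


@dataclass
class Embedding:
    """Stand-in for the library's Embedding type (fields are assumed)."""
    embedding: List[float]
    index: int = 0


@dataclass
class EmbedderOutput:
    """Stand-in for the library's EmbedderOutput; only `data` matters here."""
    data: List[Embedding] = field(default_factory=list)


def l2_normalize(data: List[Embedding]) -> List[Embedding]:
    """Hypothetical output processor: scale each vector to unit length."""
    normalized = []
    for item in data:
        norm = math.sqrt(sum(x * x for x in item.embedding))
        # Leave zero vectors untouched to avoid dividing by zero.
        if norm > 0:
            vector = [x / norm for x in item.embedding]
        else:
            vector = list(item.embedding)
        normalized.append(Embedding(embedding=vector, index=item.index))
    return normalized


def apply_output_processor(output: EmbedderOutput) -> EmbedderOutput:
    """Apply the processor to the `data` field only, as the note describes."""
    return EmbedderOutput(data=l2_normalize(output.data))


raw = EmbedderOutput(data=[Embedding([3.0, 4.0], 0), Embedding([0.0, 0.0], 1)])
processed = apply_output_processor(raw)
```

In the real library the processor would be a Component passed as output_processors; the plain function here only keeps the sketch self-contained.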
- model_type: ModelType = 1#
- model_client: ModelClient#
- output_processors: Component | None#
- classmethod from_config(config: Dict[str, Any]) Embedder [source]#
Create an Embedder from a configuration dictionary.
Example:
embedder_config = {
    "model_client": {
        "component_name": "OpenAIClient",
        "component_config": {}
    },
    "model_kwargs": {
        "model": "text-embedding-3-small",
        "dimensions": 256,
        "encoding_format": "float"
    }
}
embedder = Embedder.from_config(embedder_config)
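To make the shape of that configuration concrete, here is a minimal sketch of how a config-driven constructor could resolve a component_name into an object. It is not the library's implementation: REGISTRY, FakeClient, build_component, and build_embedder_parts are all hypothetical names introduced for this example.

```python
from typing import Any, Callable, Dict, Tuple


class FakeClient:
    """Hypothetical stand-in for a model client such as OpenAIClient."""

    def __init__(self, **kwargs: Any) -> None:
        self.kwargs = kwargs


# Hypothetical registry mapping component names to constructors.
REGISTRY: Dict[str, Callable[..., Any]] = {"FakeClient": FakeClient}


def build_component(entry: Dict[str, Any]) -> Any:
    """Look up `component_name` and construct it with `component_config`."""
    constructor = REGISTRY[entry["component_name"]]
    return constructor(**entry.get("component_config", {}))


def build_embedder_parts(config: Dict[str, Any]) -> Tuple[Any, Dict[str, Any]]:
    """Return the pieces an embedder needs: a client and its model kwargs."""
    client = build_component(config["model_client"])
    # Copy so later mutation of the config does not leak into the embedder.
    model_kwargs = dict(config.get("model_kwargs", {}))
    return client, model_kwargs


sketch_config = {
    "model_client": {"component_name": "FakeClient", "component_config": {}},
    "model_kwargs": {"model": "text-embedding-3-small", "dimensions": 256},
}
client, model_kwargs = build_embedder_parts(sketch_config)
```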
- class BatchEmbedder(embedder: Embedder, batch_size: int = 100)[source]#
Bases:
Component
Adds batching to the embedder component.
- Parameters:
embedder (Embedder) – The embedder to use for batching.
batch_size (int, optional) – The batch size to use for batching. Defaults to 100.
- call(input: str | Sequence[str], model_kwargs: Dict | None = {}) List[EmbedderOutput] [source]#
Call the embedder with batching.
- Parameters:
input (BatchEmbedderInputType) – The input to the embedder. Use this when you have a large input that needs to be batched. Also ensure the output can fit into memory.
model_kwargs (Optional[Dict], optional) – The model kwargs to pass to the embedder. Defaults to {}.
- Returns:
The output from the embedder.
- Return type:
BatchEmbedderOutputType
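The batching behaviour described for BatchEmbedder.call can be sketched roughly as follows. This is an illustration of the general technique, not the library's code: batched_call and fake_embed are hypothetical, and fake_embed stands in for a real embedder call.

```python
from typing import Callable, List, Sequence, Union

Vector = List[float]


def batched_call(
    embed_fn: Callable[[List[str]], List[Vector]],
    input: Union[str, Sequence[str]],
    batch_size: int = 100,
) -> List[List[Vector]]:
    """Split the input into chunks of `batch_size` and embed each chunk."""
    if isinstance(input, str):
        # Treat a single string as a batch of one.
        input = [input]
    outputs: List[List[Vector]] = []
    for start in range(0, len(input), batch_size):
        batch = list(input[start:start + batch_size])
        # One result per batch, mirroring the List[...] return shape.
        outputs.append(embed_fn(batch))
    return outputs


def fake_embed(texts: List[str]) -> List[Vector]:
    """Hypothetical embedder: one 1-dimensional vector per text."""
    return [[float(len(text))] for text in texts]


texts = [f"doc {i}" for i in range(250)]
results = batched_call(fake_embed, texts, batch_size=100)
print([len(batch) for batch in results])  # [100, 100, 50]
```

All per-batch results are held in one list, which is why the output has to fit into memory for very large inputs.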