embedder

The component that orchestrates a model client (embedding models in particular) and its output processors.

Classes

BatchEmbedder(embedder[, batch_size])

Adds batching to the embedder component.

Embedder(*, model_client[, model_kwargs, ...])

A user-facing component that orchestrates an embedder model via the model client and output processors.

class Embedder(*, model_client: ModelClient, model_kwargs: Dict[str, Any] = {}, output_processors: Component | None = None)

Bases: Component

A user-facing component that orchestrates an embedder model via the model client and output processors.

Parameters:
  • model_client (ModelClient) – The model client to use for the embedder.

  • model_kwargs (Dict[str, Any], optional) – The model kwargs to pass to the model client. Defaults to {}.

  • output_processors (Optional[Component], optional) – The output processors applied after the model call. Defaults to None. Any additional processor should operate on the EmbedderOutput data type.

input: a single str or a list of str. When a list is used, the list is processed as a batch of inputs in the model client.

Note

  • The output_processors will be applied only on the data field of EmbedderOutput, which is a list of Embedding.

  • Use BatchEmbedder to automatically batch large inputs, such as lists with more than 100 items.
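The first note above can be illustrated with a minimal sketch. It does not use the library itself: FakeEmbedding, FakeEmbedderOutput, apply_processor, and l2_normalize are hypothetical stand-ins, and the model field is an assumed extra field added only to show that nothing outside data is touched.

```python
import math
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class FakeEmbedding:
    """Hypothetical stand-in for Embedding."""
    embedding: List[float]


@dataclass
class FakeEmbedderOutput:
    """Hypothetical stand-in for EmbedderOutput; only `data` comes from the docs."""
    data: List[FakeEmbedding] = field(default_factory=list)
    model: Optional[str] = None  # illustrative extra field (assumption)


def l2_normalize(data: List[FakeEmbedding]) -> List[FakeEmbedding]:
    """Example processor: scale each vector to unit length."""
    normalized = []
    for item in data:
        # Fall back to 1.0 so an all-zero vector is returned unchanged.
        norm = math.sqrt(sum(x * x for x in item.embedding)) or 1.0
        normalized.append(FakeEmbedding([x / norm for x in item.embedding]))
    return normalized


def apply_processor(
    output: FakeEmbedderOutput,
    processor: Callable[[List[FakeEmbedding]], List[FakeEmbedding]],
) -> FakeEmbedderOutput:
    """Run the processor on `data` only, leaving every other field untouched."""
    output.data = processor(output.data)
    return output


output = FakeEmbedderOutput(data=[FakeEmbedding([3.0, 4.0])], model="demo-model")
processed = apply_processor(output, l2_normalize)
print(processed.data[0].embedding, processed.model)
```

In this sketch the processor receives and returns a list of embeddings, so it never needs to know about the other fields of the output object.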

model_type: ModelType = 1
model_client: ModelClient
output_processors: Component | None
classmethod from_config(config: Dict[str, Any]) -> Embedder

Create an Embedder from a configuration dictionary.

Example:

embedder_config = {
    "model_client": {
        "component_name": "OpenAIClient",
        "component_config": {}
    },
    "model_kwargs": {
        "model": "text-embedding-3-small",
        "dimensions": 256,
        "encoding_format": "float"
    }
}

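embedder = Embedder.from_config(embedder_config)

call(input: str | Sequence[str], model_kwargs: Dict | None = {}) -> EmbedderOutput

Embed a single str or a sequence of str and return the result as an EmbedderOutput.

async acall(input: str | Sequence[str], model_kwargs: Dict | None = {}) -> EmbedderOutput

Asynchronous counterpart of call, for I/O-bound work such as API calls and file I/O.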
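As a rough illustration of why an async variant is useful, the sketch below awaits several acall-style coroutines concurrently with asyncio.gather. FakeEmbedder and embed_concurrently are hypothetical stand-ins that only mimic the call/acall shape documented here; a real Embedder needs a configured model client and returns an EmbedderOutput rather than plain lists.

```python
import asyncio
from typing import List, Sequence, Union

EmbedInput = Union[str, Sequence[str]]


class FakeEmbedder:
    """Hypothetical stand-in with the same call/acall shape as Embedder."""

    def call(self, input: EmbedInput) -> List[List[float]]:
        # Accept a single string or a sequence of strings.
        items = [input] if isinstance(input, str) else list(input)
        # One toy one-dimensional "embedding" per input text.
        return [[float(len(text))] for text in items]

    async def acall(self, input: EmbedInput) -> List[List[float]]:
        # sleep(0) yields to the event loop, standing in for network I/O.
        await asyncio.sleep(0)
        return self.call(input)


async def embed_concurrently(
    embedder: FakeEmbedder, inputs: Sequence[EmbedInput]
) -> List[List[List[float]]]:
    """Start one acall per input and wait for all of them together."""
    return list(await asyncio.gather(*(embedder.acall(x) for x in inputs)))


results = asyncio.run(embed_concurrently(FakeEmbedder(), ["a", ["bb", "ccc"]]))
print(results)
```

asyncio.gather preserves the order of its arguments, so results line up with the inputs even if the underlying requests finish out of order.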

class BatchEmbedder(embedder: Embedder, batch_size: int = 100)

Bases: Component

Adds batching to the embedder component.

Parameters:
  • embedder (Embedder) – The embedder to use for batching.

  • batch_size (int, optional) – The batch size to use for batching. Defaults to 100.

call(input: str | Sequence[str], model_kwargs: Dict | None = {}) -> List[EmbedderOutput]

Call the embedder with batching.

Parameters:
  • input (BatchEmbedderInputType) – The input to the embedder. Use this when you have a large input that needs to be batched. Also ensure the output can fit into memory.

  • model_kwargs (Optional[Dict], optional) – The model kwargs to pass to the embedder. Defaults to {}.

Returns:

The output from the embedder.

Return type:

BatchEmbedderOutputType
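The batching behaviour described above can be sketched as follows. This is not the library's implementation: split_into_batches and fake_embed are hypothetical helpers, and treating a single string as a one-item list is an assumption. The sketch only shows how an input is cut into chunks of at most batch_size items, with one output produced per chunk.

```python
from typing import List, Sequence, Union


def split_into_batches(
    input: Union[str, Sequence[str]], batch_size: int = 100
) -> List[List[str]]:
    """Split the input into consecutive chunks of at most batch_size items."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    # A single string is treated as a one-item list (assumption).
    items = [input] if isinstance(input, str) else list(input)
    return [items[i : i + batch_size] for i in range(0, len(items), batch_size)]


def fake_embed(batch: Sequence[str]) -> List[List[float]]:
    """Hypothetical stand-in for one embedder call; one toy vector per text."""
    return [[float(len(text))] for text in batch]


texts = [f"doc {i}" for i in range(250)]
batches = split_into_batches(texts, batch_size=100)
outputs = [fake_embed(batch) for batch in batches]  # one output per batch
print([len(o) for o in outputs])  # prints [100, 100, 50]
```

Because every batch output is kept in a list, the combined result grows with the input, which is why the parameter description asks that the output fit into memory.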