openai_client

OpenAI ModelClient integration.

Functions

estimate_token_count(text)

Estimate the token count of a given text.

get_response_output_text(response)

Used to extract the data field for the reasoning model

handle_streaming_response(stream)

Async generator that processes a stream of SSE events from client.responses.create(..., stream=True).

handle_streaming_response_sync(stream)

Synchronous version: Iterate over an SSE stream from client.responses.create(..., stream=True), logging each raw event and yielding non-empty text fragments.

parse_response_output(response)

Parse response output that may include various types of content and tool calls.

Classes

OpenAIClient([api_key, ...])

A component wrapper for the OpenAI API client.

ParsedResponseContent([text, images, ...])

Structured container for parsed response content from OpenAI Response API.

class ParsedResponseContent(text: str | None = None, images: str | List[str] | None = None, tool_calls: List[Dict[str, Any]] | None = None, reasoning: List[Dict[str, Any]] | None = None, code_outputs: List[Dict[str, Any]] | None = None, raw_output: Any | None = None)[source]

Bases: object

Structured container for parsed response content from OpenAI Response API.

This dataclass provides a consistent interface for accessing different types of content that can be returned by the Response API, including text, images, tool calls, reasoning chains, and more.

text

The main text content from the response

Type:

str | None

images

List of image data (base64 or URLs) from image generation

Type:

str | List[str] | None

tool_calls

List of other tool call results

Type:

List[Dict[str, Any]] | None

reasoning

Reasoning chain from reasoning models

Type:

List[Dict[str, Any]] | None

code_outputs

Outputs from code interpreter

Type:

List[Dict[str, Any]] | None

raw_output

The original output array for advanced processing

Type:

Any | None

text: str | None = None
images: str | List[str] | None = None
tool_calls: List[Dict[str, Any]] | None = None
reasoning: List[Dict[str, Any]] | None = None
code_outputs: List[Dict[str, Any]] | None = None
raw_output: Any | None = None
get_response_output_text(response: Response) str[source]

Used to extract the data field for the reasoning model

parse_response_output(response: Response) ParsedResponseContent[source]

Parse response output that may include various types of content and tool calls.

The output array can contain: - Output messages (with nested content items) - Tool calls (file search, function, web search, computer use, etc.) - Reasoning chains - Image generation calls - Code interpreter calls - And more…

Returns:

Structured content with typed access to all response data

Return type:

ParsedResponseContent

estimate_token_count(text: str) int[source]

Estimate the token count of a given text.

Parameters:

text (str) – The text to estimate token count for.

Returns:

Estimated token count.

Return type:

int

async handle_streaming_response(stream: AsyncIterable[Any]) AsyncGenerator[str, None][source]

Async generator that processes a stream of SSE events from client.responses.create(…, stream=True).

Parameters:

stream – An async iterable of SSE events from the OpenAI API

Yields:

str – Non-empty text fragments parsed from the stream events

handle_streaming_response_sync(stream: Iterable) Generator[source]

Synchronous version: Iterate over an SSE stream from client.responses.create(…, stream=True), logging each raw event and yielding non-empty text fragments.

class OpenAIClient(api_key: str | None = None, non_streaming_chat_completion_parser: Callable[[Completion], Any] | None = None, streaming_chat_completion_parser: Callable[[Completion], Any] | None = None, non_streaming_response_parser: Callable[[Response], Any] | None = None, streaming_response_parser: Callable[[Response], Any] | None = None, input_type: Literal['text', 'messages'] = 'text', base_url: str = 'https://api.openai.com/v1/', env_api_key_name: str = 'OPENAI_API_KEY', organization: str | None = None, headers: Dict[str, str] | None = None)[source]

Bases: ModelClient

A component wrapper for the OpenAI API client.

Support both embedding and response API, including multimodal capabilities.

Users (1) simplify use Embedder and Generator components by passing OpenAIClient() as the model_client. (2) can use this as an example to create their own API client or extend this class(copying and modifing the code) in their own project.

Note

We suggest users not to use response_format to enforce output data type or tools and tool_choice in your model_kwargs when calling the API. We do not know how OpenAI is doing the formating or what prompt they have added. Instead - use OutputParser for response parsing and formating.

For multimodal inputs, provide images in model_kwargs[“images”] as a path, URL, or list of them. The model must support vision capabilities (e.g., gpt-4o, gpt-4o-mini, o1, o1-mini).

For image generation, use model_type=ModelType.IMAGE_GENERATION and provide: - model: “dall-e-3” or “dall-e-2” - prompt: Text description of the image to generate - size: “1024x1024”, “1024x1792”, or “1792x1024” for DALL-E 3; “256x256”, “512x512”, or “1024x1024” for DALL-E 2 - quality: “standard” or “hd” (DALL-E 3 only) - n: Number of images to generate (1 for DALL-E 3, 1-10 for DALL-E 2) - response_format: “url” or “b64_json”

Examples

Basic text generation:

from adalflow.components.model_client import OpenAIClient
from adalflow.core import Generator

# Initialize client (uses OPENAI_API_KEY env var by default)
client = OpenAIClient()

# Create a generator for text
generator = Generator(
    model_client=client,
    model_kwargs={"model": "gpt-4o-mini"}
)

# Generate response
response = generator(prompt_kwargs={"input_str": "What is machine learning?"})
print(response.data)

Multimodal with URL image:

# Vision model with image from URL
generator = Generator(
    model_client=OpenAIClient(),
    model_kwargs={
        "model": "gpt-4o",
        "images": "https://example.com/chart.jpg"
    }
)

response = generator(
    prompt_kwargs={"input_str": "Analyze this chart and explain the trends"}
)

Multimodal with local images:

# Multiple local images
generator = Generator(
    model_client=OpenAIClient(),
    model_kwargs={
        "model": "gpt-4o",
        "images": [
            "/path/to/image1.jpg",
            "/path/to/image2.png"
        ]
    }
)

response = generator(
    prompt_kwargs={"input_str": "Compare these two images"}
)

Pre-formatted images with custom encoding:

import base64
from adalflow.core.functional import encode_image

# Option 1: Using the encode_image helper
base64_img = encode_image("/path/to/image.jpg")

# Option 2: Manual base64 encoding
with open("/path/to/image.png", "rb") as f:
    base64_img = base64.b64encode(f.read()).decode('utf-8')

# Use pre-formatted image data
generator = Generator(
    model_client=OpenAIClient(),
    model_kwargs={
        "model": "gpt-4o",
        "images": [
            # Pre-formatted as base64 data URI
            f"data:image/png;base64,{base64_img}",
            # Or as a dict with type and image_url
            {
                "type": "input_image",
                "image_url": f"data:image/jpeg;base64,{base64_img}"
            },
            # Mix with regular URLs
            "https://example.com/chart.jpg"
        ]
    }
)

response = generator(
    prompt_kwargs={"input_str": "Analyze these images"}
)

Reasoning models (O1, O3):

from adalflow.core.types import ModelType

# O3 reasoning model with effort configuration
generator = Generator(
    model_client=OpenAIClient(),
    model_type=ModelType.LLM_REASONING,
    model_kwargs={
        "model": "o3",
        "reasoning": {
            "effort": "medium",  # low, medium, high
            "summary": "auto"    # detailed, auto, none
        }
    }
)

response = generator(
    prompt_kwargs={"input_str": "Solve this complex problem: ..."}
)

Image generation with DALL-E (legacy method):

from adalflow.core.types import ModelType

# Generate an image using ModelType.IMAGE_GENERATION
generator = Generator(
    model_client=OpenAIClient(),
    model_type=ModelType.IMAGE_GENERATION,
    model_kwargs={
        "model": "dall-e-3",
        "size": "1024x1792",
        "quality": "hd",
        "n": 1
    }
)

response = generator(
    prompt_kwargs={"input_str": "A futuristic city with flying cars at sunset"}
)
# response.data contains the image URL or base64 data

Image generation via tools (new API):

import base64

# Generate images using the new tools API
generator = Generator(
    model_client=OpenAIClient(),
    model_kwargs={
        "model": "gpt-4o-mini",  # or any model that supports tools
        "tools": [{"type": "image_generation"}]
    }
)

# Generate an image
response = generator(
    prompt_kwargs={
        "input_str": "Generate an image of a gray tabby cat hugging an otter with an orange scarf"
    }
)

# Access the generated image(s)
if isinstance(response.data, list):
    # Multiple images
    for i, img_base64 in enumerate(response.data):
        with open(f"generated_{i}.png", "wb") as f:
            f.write(base64.b64decode(img_base64))
elif isinstance(response.data, str):
    # Single image
    with open("generated.png", "wb") as f:
        f.write(base64.b64decode(response.data))
elif isinstance(response.data, dict) and "images" in response.data:
    # Mixed response with text and images
    print("Text:", response.data["text"])
    for i, img_base64 in enumerate(response.data["images"]):
        with open(f"generated_{i}.png", "wb") as f:
            f.write(base64.b64decode(img_base64))

Embeddings:

from adalflow.core import Embedder

# Create embedder
embedder = Embedder(
    model_client=OpenAIClient(),
    model_kwargs={"model": "text-embedding-3-small"}
)

# Generate embeddings
embeddings = embedder(input=["Hello world", "Machine learning"])
print(embeddings.data)  # List of embedding vectors

Streaming responses:

from adalflow.components.model_client.utils import extract_text_from_response_stream

# Enable streaming
generator = Generator(
    model_client=OpenAIClient(),
    model_kwargs={
        "model": "gpt-4o",
        "stream": True
    }
)

# Stream the response
response = generator(prompt_kwargs={"input_str": "Tell me a story"})

# Extract text from Response API streaming events
for event in response.raw_response:
    text = extract_text_from_response_stream(event)
    if text:
        print(text, end="")

Custom API endpoint:

# Use with third-party providers or local models
client = OpenAIClient(
    base_url="https://api.custom-provider.com/v1/",
    api_key="your-api-key",
    headers={"X-Custom-Header": "value"}
)
Parameters:
  • api_key (Optional[str], optional) – OpenAI API key. Defaults to None.

  • non_streaming_chat_completion_parser (Callable[[Completion], Any], optional) – Legacy parser for chat completions. Defaults to None (deprecated).

  • streaming_chat_completion_parser (Callable[[Completion], Any], optional) – Legacy parser for streaming chat completions. Defaults to None (deprecated).

  • non_streaming_response_parser (Callable[[Response], Any], optional) – The parser for non-streaming responses. Defaults to get_response_output_text.

  • streaming_response_parser (Callable[[Response], Any], optional) – The parser for streaming responses. Defaults to handle_streaming_response.

  • input_type (Literal["text", "messages"]) – Input type for the client. Defaults to “text”.

  • base_url (str) – The API base URL to use when initializing the client. Defaults to “https://api.openai.com/v1/”, but can be customized for third-party API providers or self-hosted models.

  • env_api_key_name (str) – The environment variable name for the API key. Defaults to “OPENAI_API_KEY”.

  • organization (Optional[str], optional) – OpenAI organization key. Defaults to None.

  • headers (Optional[Dict[str, str]], optional) – Additional headers to include in API requests. Defaults to None.

References

Note

  • Ensure each OpenAIClient instance is used by one generator only.

init_sync_client()[source]
init_async_client()[source]
parse_chat_completion(completion: Response | AsyncIterable) GeneratorOutput[source]

Parse the Response API completion and put it into the raw_response. Fully migrated to Response API only.

track_completion_usage(completion: Response | AsyncIterable) ResponseUsage[source]

Track usage for Response API only.

parse_embedding_response(response: CreateEmbeddingResponse) EmbedderOutput[source]

Parse the embedding response to a structure Adalflow components can understand.

Should be called in Embedder.

convert_inputs_to_api_kwargs(input: Any | None = None, model_kwargs: Dict = {}, model_type: ModelType = ModelType.UNDEFINED) Dict[source]

Specify the API input type and output api_kwargs that will be used in _call and _acall methods. Convert the Component’s standard input, and system_input(chat model) and model_kwargs into API-specific format. For multimodal inputs, images can be provided in model_kwargs[“images”] as a string path, URL, or list of them. The model specified in model_kwargs[“model”] must support multimodal capabilities when using images.

Parameters:
  • input – The input text or messages to process

  • model_kwargs – Additional parameters including: - images: Optional image source(s) as path, URL, or list of them - detail: Image detail level (‘auto’, ‘low’, or ‘high’), defaults to ‘auto’ - model: The model to use (must support multimodal inputs if images are provided)

  • model_type – The type of model (EMBEDDER or LLM)

Returns:

API-specific kwargs for the model call

Return type:

Dict

parse_image_generation_response(response: List[Image]) GeneratorOutput[source]

Parse the image generation response into a GeneratorOutput.

call(api_kwargs: Dict = {}, model_type: ModelType = ModelType.UNDEFINED)[source]

kwargs is the combined input and model_kwargs. Support streaming call. For reasoning model, users can add “reasoning” key to the api_kwargs to pass the reasoning config. eg: model_kwargs = {

“model”: “gpt-4o-reasoning”, “reasoning”: {

“effort”: “medium”, # low, medium, highc “summary”: “auto”, #detailed, auto, none

}

}

async acall(api_kwargs: Dict = {}, model_type: ModelType = ModelType.UNDEFINED)[source]

kwargs is the combined input and model_kwargs. Support async streaming call.

This method now relies on the OpenAI Responses API to handle streaming and non-streaming calls with the asynchronous client

classmethod from_dict(data: Dict[str, Any]) T[source]

Create an instance from previously serialized data using to_dict() method.

to_dict() Dict[str, Any][source]

Convert the component to a dictionary.