openai_client
OpenAI ModelClient integration.
Functions
- estimate_token_count: Estimate the token count of a given text.
- get_response_output_text: Used to extract the data field for the reasoning model.
- handle_streaming_response: Async generator that processes a stream of SSE events from client.responses.create(..., stream=True).
- handle_streaming_response_sync: Synchronous version: iterate over an SSE stream from client.responses.create(..., stream=True), logging each raw event and yielding non-empty text fragments.
- parse_response_output: Parse response output that may include various types of content and tool calls.
Classes
- OpenAIClient: A component wrapper for the OpenAI API client.
- ParsedResponseContent: Structured container for parsed response content from OpenAI Response API.
- class ParsedResponseContent(text: str | None = None, images: str | List[str] | None = None, tool_calls: List[Dict[str, Any]] | None = None, reasoning: List[Dict[str, Any]] | None = None, code_outputs: List[Dict[str, Any]] | None = None, raw_output: Any | None = None)
Bases: object
Structured container for parsed response content from OpenAI Response API.
This dataclass provides a consistent interface for accessing different types of content that can be returned by the Response API, including text, images, tool calls, reasoning chains, and more.
- text (str | None) – The main text content from the response. Defaults to None.
- images (str | List[str] | None) – List of image data (base64 or URLs) from image generation. Defaults to None.
- tool_calls (List[Dict[str, Any]] | None) – List of other tool call results. Defaults to None.
- reasoning (List[Dict[str, Any]] | None) – Reasoning chain from reasoning models. Defaults to None.
- code_outputs (List[Dict[str, Any]] | None) – Outputs from code interpreter. Defaults to None.
- raw_output (Any | None) – The original output array for advanced processing. Defaults to None.
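As a rough illustration of how a container like this might be consumed, the sketch below defines a stand-in dataclass with the same documented fields and checks which of them carry content. It is not the library class: `ParsedContentSketch` and `describe` are hypothetical names introduced only for this example.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Optional, Union


@dataclass
class ParsedContentSketch:
    """Stand-in mirroring the documented fields; not the adalflow class."""
    text: Optional[str] = None
    images: Optional[Union[str, List[str]]] = None
    tool_calls: Optional[List[Dict[str, Any]]] = None
    reasoning: Optional[List[Dict[str, Any]]] = None
    code_outputs: Optional[List[Dict[str, Any]]] = None
    raw_output: Any = None


def describe(parsed: ParsedContentSketch) -> List[str]:
    """Return the names of the fields that carry content."""
    populated = []
    for name in ("text", "images", "tool_calls", "reasoning", "code_outputs"):
        if getattr(parsed, name):  # skips None and empty containers
            populated.append(name)
    return populated


# images may be a single string or a list, so normalize before iterating.
parsed = ParsedContentSketch(text="A cat.", images="aGVsbG8=")
image_list = [parsed.images] if isinstance(parsed.images, str) else (parsed.images or [])
print(describe(parsed), len(image_list))
```

Because every field defaults to None, callers should check a field before using it rather than assume it is populated.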
- get_response_output_text(response: Response) -> str
Used to extract the data field for the reasoning model
- parse_response_output(response: Response) -> ParsedResponseContent
Parse response output that may include various types of content and tool calls.
The output array can contain:
- Output messages (with nested content items)
- Tool calls (file search, function, web search, computer use, etc.)
- Reasoning chains
- Image generation calls
- Code interpreter calls
- And more…
- Returns:
Structured content with typed access to all response data
- Return type:
ParsedResponseContent
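One way to picture this kind of parsing is a loop that sorts output items by their type. The sketch below works on plain dictionaries rather than SDK objects. The item shapes (`type`, nested `content`, `result`) are assumptions modeled on the Responses API, and `sort_output_items` is a hypothetical helper, not the library's implementation.

```python
from typing import Any, Dict, List


def sort_output_items(output: List[Dict[str, Any]]) -> Dict[str, Any]:
    """Group a list of output items by kind (item shapes are assumptions)."""
    parsed: Dict[str, Any] = {
        "text": None, "images": [], "tool_calls": [],
        "reasoning": [], "code_outputs": [], "raw_output": output,
    }
    text_parts: List[str] = []
    for item in output:
        kind = item.get("type")
        if kind == "message":
            # Messages nest their content items; keep only the text ones.
            for part in item.get("content", []):
                if part.get("type") == "output_text":
                    text_parts.append(part.get("text", ""))
        elif kind == "reasoning":
            parsed["reasoning"].append(item)
        elif kind == "image_generation_call":
            if item.get("result"):
                parsed["images"].append(item["result"])
        elif kind == "code_interpreter_call":
            parsed["code_outputs"].append(item)
        else:
            # Everything else (function, file search, web search, ...) is
            # kept as a generic tool call.
            parsed["tool_calls"].append(item)
    if text_parts:
        parsed["text"] = "".join(text_parts)
    return parsed


example = [
    {"type": "reasoning", "summary": []},
    {"type": "message", "content": [{"type": "output_text", "text": "Hi"}]},
    {"type": "function_call", "name": "get_weather", "arguments": "{}"},
]
result = sort_output_items(example)
print(result["text"], len(result["tool_calls"]))
```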
- estimate_token_count(text: str) -> int
Estimate the token count of a given text.
- Parameters:
text (str) – The text to estimate token count for.
- Returns:
Estimated token count.
- Return type:
int
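This page does not say how the estimate is computed. For a sense of what an estimate can look like, the sketch below applies the common rule of thumb of roughly four characters per token for English text. `rough_token_estimate` is a hypothetical helper, and its result may differ from what estimate_token_count returns.

```python
import math


def rough_token_estimate(text: str, chars_per_token: float = 4.0) -> int:
    """Approximate a token count from character length (heuristic only)."""
    if not text:
        return 0
    return max(1, math.ceil(len(text) / chars_per_token))


print(rough_token_estimate("What is machine learning?"))
```

A heuristic like this is only suitable for budgeting; exact counts require the model's tokenizer.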
- async handle_streaming_response(stream: AsyncIterable[Any]) -> AsyncGenerator[str, None]
Async generator that processes a stream of SSE events from client.responses.create(…, stream=True).
- Parameters:
stream – An async iterable of SSE events from the OpenAI API
- Yields:
str – Non-empty text fragments parsed from the stream events
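The filtering pattern described here can be sketched without a network call. The example below assumes text arrives on events of type `response.output_text.delta` with a `delta` attribute. `text_fragments` and `fake_stream` are hypothetical stand-ins and do not reproduce the library's handler.

```python
import asyncio
from types import SimpleNamespace
from typing import Any, AsyncGenerator, AsyncIterable, List


async def text_fragments(stream: AsyncIterable[Any]) -> AsyncGenerator[str, None]:
    """Yield non-empty text deltas from a stream of event objects."""
    async for event in stream:
        # Assumption: text arrives on "response.output_text.delta" events
        # in a `delta` attribute; other event types are skipped.
        if getattr(event, "type", None) == "response.output_text.delta":
            delta = getattr(event, "delta", "")
            if delta:
                yield delta


async def fake_stream() -> AsyncGenerator[Any, None]:
    """Stand-in for the SDK stream, used only to exercise the sketch."""
    yield SimpleNamespace(type="response.created")
    yield SimpleNamespace(type="response.output_text.delta", delta="Hello")
    yield SimpleNamespace(type="response.output_text.delta", delta="")
    yield SimpleNamespace(type="response.output_text.delta", delta=", world")
    yield SimpleNamespace(type="response.completed")


async def collect() -> List[str]:
    return [fragment async for fragment in text_fragments(fake_stream())]


fragments = asyncio.run(collect())
print("".join(fragments))
```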
- handle_streaming_response_sync(stream: Iterable) -> Generator
Synchronous version: Iterate over an SSE stream from client.responses.create(…, stream=True), logging each raw event and yielding non-empty text fragments.
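A minimal synchronous sketch of the same idea, with each raw event logged before filtering. The logger name, the event type string, and `text_fragments_sync` are assumptions made for illustration.

```python
import logging
from types import SimpleNamespace
from typing import Any, Generator, Iterable

log = logging.getLogger("stream_sketch")  # hypothetical logger name


def text_fragments_sync(stream: Iterable[Any]) -> Generator[str, None, None]:
    """Log every raw event, then yield its text delta if it has one."""
    for event in stream:
        log.debug("raw event: %r", event)
        # Assumption: only output-text delta events carry user-visible text.
        if getattr(event, "type", None) != "response.output_text.delta":
            continue
        delta = getattr(event, "delta", None)
        if isinstance(delta, str) and delta:
            yield delta


events = [
    SimpleNamespace(type="response.created"),
    SimpleNamespace(type="response.output_text.delta", delta="Once "),
    SimpleNamespace(type="response.output_text.delta", delta="upon a time"),
]
story = "".join(text_fragments_sync(events))
print(story)
```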
- class OpenAIClient(api_key: str | None = None, non_streaming_chat_completion_parser: Callable[[Completion], Any] | None = None, streaming_chat_completion_parser: Callable[[Completion], Any] | None = None, non_streaming_response_parser: Callable[[Response], Any] | None = None, streaming_response_parser: Callable[[Response], Any] | None = None, input_type: Literal['text', 'messages'] = 'text', base_url: str = 'https://api.openai.com/v1/', env_api_key_name: str = 'OPENAI_API_KEY', organization: str | None = None, headers: Dict[str, str] | None = None)
Bases: ModelClient
A component wrapper for the OpenAI API client.
Supports both the embedding and response APIs, including multimodal capabilities.
Users can (1) simplify the use of Embedder and Generator components by passing OpenAIClient() as the model_client, and (2) use this class as an example to create their own API client, or extend it (by copying and modifying the code) in their own project.
Note
We suggest that users avoid response_format to enforce the output data type, and avoid tools and tool_choice in model_kwargs when calling the API. We do not know how OpenAI does the formatting or what prompt it adds. Instead, use OutputParser for response parsing and formatting.
For multimodal inputs, provide images in model_kwargs["images"] as a path, URL, or list of them. The model must support vision capabilities (e.g., gpt-4o, gpt-4o-mini, o1, o1-mini).
For image generation, use model_type=ModelType.IMAGE_GENERATION and provide:
- model: "dall-e-3" or "dall-e-2"
- prompt: Text description of the image to generate
- size: "1024x1024", "1024x1792", or "1792x1024" for DALL-E 3; "256x256", "512x512", or "1024x1024" for DALL-E 2
- quality: "standard" or "hd" (DALL-E 3 only)
- n: Number of images to generate (1 for DALL-E 3, 1-10 for DALL-E 2)
- response_format: "url" or "b64_json"
Examples
Basic text generation:
    from adalflow.components.model_client import OpenAIClient
    from adalflow.core import Generator

    # Initialize client (uses OPENAI_API_KEY env var by default)
    client = OpenAIClient()

    # Create a generator for text
    generator = Generator(
        model_client=client,
        model_kwargs={"model": "gpt-4o-mini"}
    )

    # Generate response
    response = generator(prompt_kwargs={"input_str": "What is machine learning?"})
    print(response.data)
Multimodal with URL image:
    # Vision model with image from URL
    generator = Generator(
        model_client=OpenAIClient(),
        model_kwargs={
            "model": "gpt-4o",
            "images": "https://example.com/chart.jpg"
        }
    )

    response = generator(
        prompt_kwargs={"input_str": "Analyze this chart and explain the trends"}
    )
Multimodal with local images:
    # Multiple local images
    generator = Generator(
        model_client=OpenAIClient(),
        model_kwargs={
            "model": "gpt-4o",
            "images": [
                "/path/to/image1.jpg",
                "/path/to/image2.png"
            ]
        }
    )

    response = generator(
        prompt_kwargs={"input_str": "Compare these two images"}
    )
Pre-formatted images with custom encoding:
    import base64
    from adalflow.core.functional import encode_image

    # Option 1: Using the encode_image helper
    base64_img = encode_image("/path/to/image.jpg")

    # Option 2: Manual base64 encoding
    with open("/path/to/image.png", "rb") as f:
        base64_img = base64.b64encode(f.read()).decode('utf-8')

    # Use pre-formatted image data
    generator = Generator(
        model_client=OpenAIClient(),
        model_kwargs={
            "model": "gpt-4o",
            "images": [
                # Pre-formatted as base64 data URI
                f"data:image/png;base64,{base64_img}",
                # Or as a dict with type and image_url
                {
                    "type": "input_image",
                    "image_url": f"data:image/jpeg;base64,{base64_img}"
                },
                # Mix with regular URLs
                "https://example.com/chart.jpg"
            ]
        }
    )

    response = generator(
        prompt_kwargs={"input_str": "Analyze these images"}
    )
Reasoning models (O1, O3):
    from adalflow.core.types import ModelType

    # O3 reasoning model with effort configuration
    generator = Generator(
        model_client=OpenAIClient(),
        model_type=ModelType.LLM_REASONING,
        model_kwargs={
            "model": "o3",
            "reasoning": {
                "effort": "medium",  # low, medium, high
                "summary": "auto"    # detailed, auto, none
            }
        }
    )

    response = generator(
        prompt_kwargs={"input_str": "Solve this complex problem: ..."}
    )
Image generation with DALL-E (legacy method):
    from adalflow.core.types import ModelType

    # Generate an image using ModelType.IMAGE_GENERATION
    generator = Generator(
        model_client=OpenAIClient(),
        model_type=ModelType.IMAGE_GENERATION,
        model_kwargs={
            "model": "dall-e-3",
            "size": "1024x1792",
            "quality": "hd",
            "n": 1
        }
    )

    response = generator(
        prompt_kwargs={"input_str": "A futuristic city with flying cars at sunset"}
    )
    # response.data contains the image URL or base64 data
Image generation via tools (new API):
    import base64

    # Generate images using the new tools API
    generator = Generator(
        model_client=OpenAIClient(),
        model_kwargs={
            "model": "gpt-4o-mini",  # or any model that supports tools
            "tools": [{"type": "image_generation"}]
        }
    )

    # Generate an image
    response = generator(
        prompt_kwargs={
            "input_str": "Generate an image of a gray tabby cat hugging an otter with an orange scarf"
        }
    )

    # Access the generated image(s)
    if isinstance(response.data, list):
        # Multiple images
        for i, img_base64 in enumerate(response.data):
            with open(f"generated_{i}.png", "wb") as f:
                f.write(base64.b64decode(img_base64))
    elif isinstance(response.data, str):
        # Single image
        with open("generated.png", "wb") as f:
            f.write(base64.b64decode(response.data))
    elif isinstance(response.data, dict) and "images" in response.data:
        # Mixed response with text and images
        print("Text:", response.data["text"])
        for i, img_base64 in enumerate(response.data["images"]):
            with open(f"generated_{i}.png", "wb") as f:
                f.write(base64.b64decode(img_base64))
Embeddings:
    from adalflow.core import Embedder

    # Create embedder
    embedder = Embedder(
        model_client=OpenAIClient(),
        model_kwargs={"model": "text-embedding-3-small"}
    )

    # Generate embeddings
    embeddings = embedder(input=["Hello world", "Machine learning"])
    print(embeddings.data)  # List of embedding vectors
Streaming responses:
    from adalflow.components.model_client.utils import extract_text_from_response_stream

    # Enable streaming
    generator = Generator(
        model_client=OpenAIClient(),
        model_kwargs={
            "model": "gpt-4o",
            "stream": True
        }
    )

    # Stream the response
    response = generator(prompt_kwargs={"input_str": "Tell me a story"})

    # Extract text from Response API streaming events
    for event in response.raw_response:
        text = extract_text_from_response_stream(event)
        if text:
            print(text, end="")
Custom API endpoint:
    # Use with third-party providers or local models
    client = OpenAIClient(
        base_url="https://api.custom-provider.com/v1/",
        api_key="your-api-key",
        headers={"X-Custom-Header": "value"}
    )
- Parameters:
api_key (Optional[str], optional) – OpenAI API key. Defaults to None.
non_streaming_chat_completion_parser (Callable[[Completion], Any], optional) – Legacy parser for chat completions. Defaults to None (deprecated).
streaming_chat_completion_parser (Callable[[Completion], Any], optional) – Legacy parser for streaming chat completions. Defaults to None (deprecated).
non_streaming_response_parser (Callable[[Response], Any], optional) – The parser for non-streaming responses. Defaults to get_response_output_text.
streaming_response_parser (Callable[[Response], Any], optional) – The parser for streaming responses. Defaults to handle_streaming_response.
input_type (Literal["text", "messages"]) – Input type for the client. Defaults to “text”.
base_url (str) – The API base URL to use when initializing the client. Defaults to “https://api.openai.com/v1/”, but can be customized for third-party API providers or self-hosted models.
env_api_key_name (str) – The environment variable name for the API key. Defaults to “OPENAI_API_KEY”.
organization (Optional[str], optional) – OpenAI organization key. Defaults to None.
headers (Optional[Dict[str, str]], optional) – Additional headers to include in API requests. Defaults to None.
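The interplay between api_key and env_api_key_name can be pictured as a simple fallback. The sketch below assumes an explicitly passed key takes precedence over the environment variable. That precedence, the error message, and the `resolve_api_key` helper are assumptions for illustration and are not taken from the library source.

```python
import os
from typing import Optional


def resolve_api_key(api_key: Optional[str] = None,
                    env_api_key_name: str = "OPENAI_API_KEY") -> str:
    """Return the explicit key if given, else the named environment variable."""
    key = api_key or os.environ.get(env_api_key_name)
    if not key:
        raise ValueError(f"Set {env_api_key_name} or pass api_key explicitly.")
    return key


os.environ["MY_PROVIDER_KEY"] = "env-key"  # demo value only
print(resolve_api_key(env_api_key_name="MY_PROVIDER_KEY"))
print(resolve_api_key(api_key="explicit-key", env_api_key_name="MY_PROVIDER_KEY"))
```

Changing env_api_key_name is what lets one process hold separate keys for several OpenAI-compatible providers.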
References
OpenAI API Overview: https://platform.openai.com/docs/introduction, https://platform.openai.com/docs/guides/images-vision?api-mode=responses
Embeddings Guide: https://platform.openai.com/docs/guides/embeddings
Chat Completion Models: https://platform.openai.com/docs/guides/text-generation
Responses API: https://platform.openai.com/docs/api-reference/responses/create (analyze images and use them as input and/or generate images as output)
Vision Models: https://platform.openai.com/docs/guides/vision
Image Generation: https://platform.openai.com/docs/guides/images
Reasoning: https://platform.openai.com/docs/guides/reasoning
Note
Ensure each OpenAIClient instance is used by one generator only.
- parse_chat_completion(completion: Response | AsyncIterable) -> GeneratorOutput
Parse the Response API completion and put it into the raw_response. Fully migrated to Response API only.
- track_completion_usage(completion: Response | AsyncIterable) -> ResponseUsage
Track usage for Response API only.
- parse_embedding_response(response: CreateEmbeddingResponse) -> EmbedderOutput
Parse the embedding response to a structure Adalflow components can understand.
Should be called in Embedder.
- convert_inputs_to_api_kwargs(input: Any | None = None, model_kwargs: Dict = {}, model_type: ModelType = ModelType.UNDEFINED) -> Dict
Specify the API input type and produce the api_kwargs that will be used in the _call and _acall methods. Converts the Component's standard input, the system_input (for chat models), and model_kwargs into the API-specific format.
For multimodal inputs, images can be provided in model_kwargs["images"] as a string path, a URL, or a list of them. The model specified in model_kwargs["model"] must support multimodal capabilities when using images.
- Parameters:
input – The input text or messages to process
model_kwargs – Additional parameters, including:
- images: Optional image source(s) as a path, URL, or list of them
- detail: Image detail level ('auto', 'low', or 'high'); defaults to 'auto'
- model: The model to use (must support multimodal inputs if images are provided)
model_type – The type of model (EMBEDDER or LLM)
- Returns:
API-specific kwargs for the model call
- Return type:
Dict
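To make the image handling concrete, the sketch below normalizes the accepted image forms (path, URL, data URI, pre-formatted dict, or a list of these) into content items with the `input_image` shape used in the examples earlier on this page. `to_image_item` and `normalize_images` are hypothetical helpers, and the MIME-type fallback is an assumption. The library's own conversion may differ.

```python
import base64
import mimetypes
import os
import tempfile
from typing import Any, Dict, List, Union

ImageSource = Union[str, Dict[str, Any]]


def to_image_item(source: ImageSource, detail: str = "auto") -> Dict[str, Any]:
    """Turn one image source into an input_image content item (assumed shape)."""
    if isinstance(source, dict):
        return source  # already pre-formatted by the caller
    if source.startswith(("http://", "https://", "data:")):
        url = source  # remote URL or ready-made data URI
    else:
        # Local path: read the bytes and wrap them in a base64 data URI.
        mime = mimetypes.guess_type(source)[0] or "image/jpeg"
        with open(source, "rb") as f:
            encoded = base64.b64encode(f.read()).decode("utf-8")
        url = f"data:{mime};base64,{encoded}"
    return {"type": "input_image", "image_url": url, "detail": detail}


def normalize_images(images: Union[ImageSource, List[ImageSource]],
                     detail: str = "auto") -> List[Dict[str, Any]]:
    """Accept a single source or a list and always return a list of items."""
    sources = images if isinstance(images, list) else [images]
    return [to_image_item(s, detail) for s in sources]


# Exercise the sketch with one URL and one temporary local file.
with tempfile.NamedTemporaryFile(suffix=".png", delete=False) as tmp:
    tmp.write(b"not-a-real-png")
items = normalize_images(["https://example.com/chart.jpg", tmp.name], detail="low")
os.remove(tmp.name)
print([item["image_url"][:22] for item in items])
```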
- parse_image_generation_response(response: List[Image]) -> GeneratorOutput
Parse the image generation response into a GeneratorOutput.
- call(api_kwargs: Dict = {}, model_type: ModelType = ModelType.UNDEFINED)
api_kwargs is the combined input and model_kwargs. Supports streaming calls. For reasoning models, users can add a "reasoning" key to api_kwargs to pass the reasoning config, e.g.:
    model_kwargs = {
        "model": "gpt-4o-reasoning",
        "reasoning": {
            "effort": "medium",  # low, medium, high
            "summary": "auto",  # detailed, auto, none
        }
    }
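A minimal sketch of how the input and model_kwargs might be combined into one api_kwargs dictionary, with a check on the effort value. The allowed values come from the comment in the example above, and the `input` key follows the Responses API. `build_api_kwargs` is a hypothetical helper, not a library function.

```python
from typing import Any, Dict

# Values taken from the comment in the docstring example; an assumption here.
ALLOWED_EFFORT = {"low", "medium", "high"}


def build_api_kwargs(input_text: str, model_kwargs: Dict[str, Any]) -> Dict[str, Any]:
    """Merge the input with model_kwargs, checking the reasoning effort."""
    reasoning = model_kwargs.get("reasoning")
    if reasoning is not None:
        effort = reasoning.get("effort")
        if effort is not None and effort not in ALLOWED_EFFORT:
            raise ValueError(f"effort must be one of {sorted(ALLOWED_EFFORT)}")
    # Build a new dict so the caller's model_kwargs is not mutated.
    return {**model_kwargs, "input": input_text}


api_kwargs = build_api_kwargs(
    "Solve this complex problem: ...",
    {"model": "o3", "reasoning": {"effort": "medium", "summary": "auto"}},
)
print(sorted(api_kwargs))
```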
- async acall(api_kwargs: Dict = {}, model_type: ModelType = ModelType.UNDEFINED)
api_kwargs is the combined input and model_kwargs. Supports async streaming calls.
This method relies on the OpenAI Responses API to handle streaming and non-streaming calls with the asynchronous client.