Tracing

We provide two tracing methods to help you develop and improve the Generator:

1. Trace the history of changes (states) to the prompt during your development process. Developers typically go through a long process of prompt optimization, and it is frustrating to lose track of prompt changes when a new change actually makes performance much worse.

We created a GeneratorStateLogger to handle the logging and save the states into JSON files. To further simplify the developer's workflow, we provide a class decorator, trace_generator_states, which can be added to any of your task components with a single line of code. It automatically tracks all attributes of type Generator.

from adalflow.tracing import trace_generator_states
from adalflow.core import Component, Generator

@trace_generator_states()
class SimpleQA(Component):
    def __init__(self):
        super().__init__()
        self.generator = Generator(...)
        self.generator_2 = Generator(...)
    def call(...):

By default, a traces directory is created under the current working directory to store the log files. The default project name is SimpleQA (the class name), and the log file is named generator_state_trace.json, where both generator and generator_2 are logged. The structure of the log directory is as follows:

.
├── traces
│   ├── SimpleQA
│      ├── generator_state_trace.json
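If you want to store the traces somewhere else or under a different project name, the decorator is expected to accept keyword arguments for the save directory, project name, and file name. The exact argument names below are assumptions, so verify them against the tracing API reference:

from adalflow.tracing import trace_generator_states
from adalflow.core import Component, Generator

# The keyword names below are assumed -- check the tracing API reference.
@trace_generator_states(
    save_dir="./my_traces/",          # root directory for traces (assumed keyword)
    project_name="SimpleQA_v2",       # overrides the class-name default (assumed keyword)
    filename="prompt_history.json",   # overrides generator_state_trace.json (assumed keyword)
)
class SimpleQA(Component):
    def __init__(self):
        super().__init__()
        self.generator = Generator(...)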

Here is an example log file:

{
    "generator": [
        {
            "prompt_states": {
                "_components": {},
                "_parameters": {},
                "training": false,
                "_template_string": "{# task desc #}\n{% if task_desc_str %}\n{{task_desc_str}}\n{% else %}\nAnswer user query.\n{% endif %}\n{# output format #}\n{% if output_format_str %}\n<OUTPUT_FORMAT>\n{{output_format_str}}\n</OUTPUT_FORMAT>\n{% endif %}\n{# tools #}\n{% if tools_str %}\n<TOOLS>\n{{tools_str}}\n</TOOLS>\n{% endif %}\n{# example #}\n{% if examples_str %}\n<EXAMPLES>\n{{examples_str}}\n</EXAMPLES>\n{% endif %}\n{# chat history #}\n{% if chat_history_str %}\n<CHAT_HISTORY>\n{{chat_history_str}}\n</CHAT_HISTORY>\n{% endif %}\n{#contex#}\n{% if context_str %}\n<CONTEXT>\n{{context_str}}\n</CONTEXT>\n{% endif %}\n{# steps #}\n{% if steps_str %}\n<STEPS>\n{{steps_str}}\n</STEPS>\n{% endif %}\n{% if input_str %}\n<Inputs>\n{{input_str}}\n</Inputs>\n{% endif %}\n{% if output_str %}\n<Outputs>\n{{output_str}}\n</Outputs>\n{% endif %}\n",
                "prompt_variables": [
                    "chat_history_str",
                    "context_str",
                    "examples_str",
                    "input_str",
                    "output_format_str",
                    "output_str",
                    "steps_str",
                    "task_desc_str",
                    "tools_str"
                ],
                "preset_prompt_kwargs": {
                    "task_desc_str": "You are a helpful assistant and with a great sense of humor."
                }
            },
            "time_stamp": "2024-06-02T15:55:21.765794"
        },
        {
            "prompt_states": {
                "_components": {},
                "_parameters": {},
                "training": false,
                "_template_string": "{# task desc #}\n{% if task_desc_str %}\n{{task_desc_str}}\n{% else %}\nAnswer user query.\n{% endif %}\n{# output format #}\n{% if output_format_str %}\n<OUTPUT_FORMAT>\n{{output_format_str}}\n</OUTPUT_FORMAT>\n{% endif %}\n{# tools #}\n{% if tools_str %}\n<TOOLS>\n{{tools_str}}\n</TOOLS>\n{% endif %}\n{# example #}\n{% if examples_str %}\n<EXAMPLES>\n{{examples_str}}\n</EXAMPLES>\n{% endif %}\n{# chat history #}\n{% if chat_history_str %}\n<CHAT_HISTORY>\n{{chat_history_str}}\n</CHAT_HISTORY>\n{% endif %}\n{#contex#}\n{% if context_str %}\n<CONTEXT>\n{{context_str}}\n</CONTEXT>\n{% endif %}\n{# steps #}\n{% if steps_str %}\n<STEPS>\n{{steps_str}}\n</STEPS>\n{% endif %}\n{% if input_str %}\n<Inputs>\n{{input_str}}\n</Inputs>\n{% endif %}\n{% if output_str %}\n<Outputs>\n{{output_str}}\n</Outputs>\n{% endif %}\n",
                "prompt_variables": [
                    "chat_history_str",
                    "context_str",
                    "examples_str",
                    "input_str",
                    "output_format_str",
                    "output_str",
                    "steps_str",
                    "task_desc_str",
                    "tools_str"
                ],
                "preset_prompt_kwargs": {
                    "task_desc_str": "You are a helpful assistant and with a great sense of humor. Second edition."
                }
            },
            "time_stamp": "2024-06-02T15:56:37.756148"
        }
    ],
    "generator2": [
    {
        "prompt_states": {
            "_components": {},
            "_parameters": {},
            "training": false,
            "_template_string": "{# task desc #}\n{% if task_desc_str %}\n{{task_desc_str}}\n{% else %}\nAnswer user query.\n{% endif %}\n{# output format #}\n{% if output_format_str %}\n<OUTPUT_FORMAT>\n{{output_format_str}}\n</OUTPUT_FORMAT>\n{% endif %}\n{# tools #}\n{% if tools_str %}\n<TOOLS>\n{{tools_str}}\n</TOOLS>\n{% endif %}\n{# example #}\n{% if examples_str %}\n<EXAMPLES>\n{{examples_str}}\n</EXAMPLES>\n{% endif %}\n{# chat history #}\n{% if chat_history_str %}\n<CHAT_HISTORY>\n{{chat_history_str}}\n</CHAT_HISTORY>\n{% endif %}\n{#contex#}\n{% if context_str %}\n<CONTEXT>\n{{context_str}}\n</CONTEXT>\n{% endif %}\n{# steps #}\n{% if steps_str %}\n<STEPS>\n{{steps_str}}\n</STEPS>\n{% endif %}\n{% if input_str %}\n<Inputs>\n{{input_str}}\n</Inputs>\n{% endif %}\n{% if output_str %}\n<Outputs>\n{{output_str}}\n</Outputs>\n{% endif %}\n",
            "prompt_variables": [
                "chat_history_str",
                "context_str",
                "examples_str",
                "input_str",
                "output_format_str",
                "output_str",
                "steps_str",
                "task_desc_str",
                "tools_str"
            ],
            "preset_prompt_kwargs": {
                "task_desc_str": "You are the second generator."
            }
        },
        "time_stamp": "2024-06-03T16:44:45.223220"
    }
]
}
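Since the trace is plain JSON, you can inspect the prompt-state history with nothing but the standard library. A minimal sketch, assuming the default paths shown above:

import json

# Load the state trace written by GeneratorStateLogger (default location shown above).
with open("./traces/SimpleQA/generator_state_trace.json", "r") as f:
    trace = json.load(f)

# Each traced attribute maps to a list of recorded prompt states, oldest first.
for name, states in trace.items():
    print(f"{name}: {len(states)} recorded state(s)")
    for state in states:
        preset = state["prompt_states"].get("preset_prompt_kwargs", {})
        print(" ", state["time_stamp"], preset.get("task_desc_str"))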

2. Trace all failed LLM predictions for further improvement.

Similarly, tracing.generator_call_logger.GeneratorCallLogger is created to log the input arguments and output results of generator calls. The trace_generator_call decorator provides a one-line setup to trace these calls and, by default, logs only failed predictions.

Adding the second decorator to the above example:

from adalflow.tracing import trace_generator_call

@trace_generator_call()
@trace_generator_states()
class SimpleQA(Component):
    def __init__(self):
        super().__init__()
        self.generator = Generator(...)
        self.generator_2 = Generator(...)
    def call(...):
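By default, only failed predictions are written to the call log. If you also want to keep successful calls for inspection, the decorator is expected to expose a flag for that; the argument name below is a hypothetical placeholder, so confirm it against the tracing API reference:

from adalflow.tracing import trace_generator_call, trace_generator_states
from adalflow.core import Component

# error_only is an assumed keyword; setting it to False would log successful calls as well.
@trace_generator_call(error_only=False)
@trace_generator_states()
class SimpleQA(Component):
    ...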

Now, three more files will be created in the log directory:

.
├── traces
│   ├── SimpleQA
│      ├── logger_metadata.json
│      ├── generator_call.jsonl
│      ├── generator_2_call.jsonl

The logger_metadata.json file contains the metadata of the logger. It looks like this:

{
    "generator": "./traces/SimpleQA/generator_call.jsonl",
    "generator_2": "./traces/SimpleQA/generator_2_call.jsonl"
}

The generator_call.jsonl file contains the log of all calls to the generator. It looks like this:

{"prompt_kwargs": {"input_str": "What is the capital of France?"}, "model_kwargs": {}, "output": {"data": "Bonjour!\n\nThe capital of France is Paris, of course! But did you know that the Eiffel Tower in Paris is actually the most-visited paid monument in the world? Mind-blowing, right?\n\nNow, would you like to know some more fun facts or perhaps ask another question? I'm all ears (or should I say, all eyes?)", "error_message": null, "raw_response": "Bonjour!\n\nThe capital of France is Paris, of course! But did you know that the Eiffel Tower in Paris is actually the most-visited paid monument in the world? Mind-blowing, right?\n\nNow, would you like to know some more fun facts or perhaps ask another question? I'm all ears (or should I say, all eyes?)"}, "time_stamp": "2024-06-03T16:44:45.582859"}

Note

Usually, letting evaluation runs collect as many failed predictions as possible is highly helpful for either manual prompt engineering or auto-prompt engineering (APE).