tgd_optimizer

Text-grad optimizer and prompts. Also combines methods from the OPRO LLM optimizer.

With auto-diff gradients, it becomes possible to optimize any prompt parameter in a task pipeline.

Paper: https://arxiv.org/abs/2309.03409
Source code: https://github.com/google-deepmind/opro

Functions

gumbel_top_k(scores, k, *[, probs, seed, ...])

Gumbel Top-k sampling with balanced exploration-exploitation.

Classes

CustomizedXMLParser()

Custom XML parser for TGD optimizer output with reasoning, method, and proposed_variable fields.

HistoryPrompt(id, value, eval_score[, ...])

Instruction(text, score[, responses, gts])

Structured variable values for instructions.

TGDData(reasoning, method[, proposed_variable])

TGDOptimizer(params, model_client[, ...])

Textual Gradient Descent (LLM) optimizer for text-based variables.

TGDOptimizerTrace([api_kwargs, output])

class HistoryPrompt(id: str, value: str, eval_score: float, method: str = None, reasoning: str = None)[source]

Bases: DataClass

id: str
value: str
eval_score: float
method: str = None
reasoning: str = None
class Instruction(text: str, score: float, responses: List[str] | None = None, gts: List[str] | None = None)[source]

Bases: DataClass

Structured variable values for instructions. Can be used in the history of instructions.

text: str
score: float
responses: List[str] | None = None
gts: List[str] | None = None
class TGDData(reasoning: str, method: str, proposed_variable: str = None)[source]

Bases: DataClass

reasoning: str
method: str
proposed_variable: str = None
class TGDOptimizerTrace(api_kwargs: Dict[str, Any] = None, output: optim.text_grad.tgd_optimizer.TGDData = None)[source]

Bases: DataClass

api_kwargs: Dict[str, Any] = None
output: TGDData = None
class CustomizedXMLParser[source]

Bases: DataComponent

Custom XML parser for TGD optimizer output with reasoning, method, and proposed_variable fields.

get_output_format_str() -> str[source]
call(input: str) -> TGDData[source]

Parse the XML response and extract the three fields, returning TGDData directly.

extract_new_variable(text: str) -> str[source]
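
A minimal usage sketch, assuming the adalflow.optim.text_grad.tgd_optimizer import path suggested by the qualified names above, and an XML layout whose tag names match the three documented fields; the response text is hypothetical:

    from adalflow.optim.text_grad.tgd_optimizer import CustomizedXMLParser

    # Hypothetical optimizer-LLM response; tag names are assumed to match
    # the documented fields (reasoning, method, proposed_variable).
    llm_response = (
        "<reasoning>The feedback flags ambiguity in the output format.</reasoning>"
        "<method>Rephrase existing instruction</method>"
        "<proposed_variable>Answer with a single word: Yes or No.</proposed_variable>"
    )

    parser = CustomizedXMLParser()
    tgd_data = parser.call(llm_response)  # returns a TGDData instance
    print(tgd_data.reasoning, tgd_data.method, tgd_data.proposed_variable)
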
class TGDOptimizer(params: Iterable[Parameter] | Iterable[Dict[str, Any]], model_client: ModelClient, model_kwargs: Dict[str, object] = {}, constraints: List[str] = None, optimizer_system_prompt: str = 'You are an excellent prompt engineer tasked with instruction and demonstration tuning a compound LLM system.\nYour task is to refine a variable/prompt based on feedback from a batch of input data points.\n\nThe variable is either input or output of a functional component where the component schema will be provided.\nIf the same DataID has multiple gradients, it means this component/variable is called multiple times in the compound system(with a cycle) in the same order as it appears in the gradient list.\n\nYou Must edit the current variable with one of the following editing methods.\nYou can not rewrite everything all at once:\n\nYou have Four Editing Methods:\n1. ADD new elements(instruction) to address each specific feedback.\n2. ADD Examples (e.g., input-reasoning-answer) for tasks that require strong reasoning skills.\n3. Rephrase existing instruction(for more clarity), Replace existing sample with another, to address the feedback.\n4. DELETE unnecessary words to improve clarity.\n\nThese SIX prompting techniques can be a helpful direction.\n1. Set Context and Role: Establish a specific identity or domain expertise for the AI to guide style, knowledge, and constraints.\n2. Be Specific, Clear, and Grammarly correct: Clearly define instructions, desired format, and constraints to ensure accurate and relevant outputs with regards to the feedback.\n3. Illicit reasoning: "chain-of-thought" (e.g. "think step by step") helps the model reason better.\n4. Examples: Construct examples(e.g., input(optional)-reasoning(required)-answer) especially for tasks that require strong reasoning skills.\n5. Leverage Constraints and Formatting: Explicitly direct how the answer should be structured (e.g., bullet points, tables, or tone).\n6. Self-Consistency / Verification Prompts: Prompt the model to check its own logic for errors, inconsistencies, or missing details.\n\nYour final action/reasoning  = one of FOUR editing method + one of SIX prompting technique.\n\nYou must stick to these instructions:\n1. **MUST Resolve concerns raised in the feedback** while preserving the positive aspects of the original variable.\n2. **Observe past performance patterns** to retain good qualities in the variable and past failed ones to try things differently.\n3. **System Awareness**: When other system variables are given, ensure you understand how this variable works in the whole system.\n4. **Peer Awareness**: This variable works together with Peer variables, ensure you are aware of their roles and constraints.\n5. **Batch Awareness**: You are optimizing a batch of input data, ensure the change applys to the whole batch (except while using demonstration.)\n\n{{output_format_str}}\n\n{% if instruction_to_optimizer %}\n**Additional User Instructions**: {{instruction_to_optimizer}}\n{% endif %}\n', in_context_examples: List[str] = None, max_past_history: int = 3, max_failed_proposals: int = 5, steps_from_last_improvement: int = 0, one_parameter_at_a_time: bool = False)[source]

Bases: TextOptimizer

Textual Gradient Descent (LLM) optimizer for text-based variables.

proposing: bool = False
params_history: Dict[str, List[HistoryPrompt]] = {}
failed_proposals: Dict[str, List[HistoryPrompt]] = {}
current_tgd_output: Dict[str, TGDData | None] = {}
params: Iterable[Parameter] | Iterable[Dict[str, Any]]
constraints: List[str]
one_parameter_at_a_time: bool
property constraint_text

Returns a formatted string representation of the constraints.

Returns:

A string containing the constraints in the format “Constraint {index}: {constraint}”.

Return type:

str
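
A hypothetical illustration of the rendered format; the construction arguments are stand-ins, and the 1-based index is an assumption read off the template above:

    # `pipeline` and `client` are hypothetical; the constraints are illustrative.
    optimizer = TGDOptimizer(
        params=pipeline.parameters(),
        model_client=client,
        constraints=["Keep the instruction under 100 words.",
                     "Do not reveal the ground truth."],
    )
    print(optimizer.constraint_text)
    # Constraint 1: Keep the instruction under 100 words.   (index base assumed)
    # Constraint 2: Do not reveal the ground truth.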

increment_steps_from_last_improvement()[source]
reset_steps_from_last_improvement()[source]
add_score_to_params(val_score: float)[source]
add_score_to_current_param(param_id: str, param: Parameter, score: float)[source]
add_history(param_id: str, history: HistoryPrompt)[source]
render_history(param_id: str) -> List[str][source]

Render history for the optimizer prompt.

Selects the top max_past_history prompts by their average score across all evaluations (from the trainer's multi-minibatch tracking).

Returns:

List of YAML strings for the top prompts
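
A hypothetical sketch of recording and rendering history via the documented add_history and render_history signatures; the parameter id, prompt text, and score are made up:

    # Record one evaluated prompt version under a hypothetical parameter id.
    optimizer.add_history(
        "system_prompt",
        HistoryPrompt(
            id="v3",
            value="Think step by step, then answer Yes or No.",
            eval_score=0.74,
            method="Rephrase existing instruction",
            reasoning="Feedback flagged ambiguity in the expected output format.",
        ),
    )
    yaml_entries = optimizer.render_history("system_prompt")  # top prompts as YAML strings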

add_failed_proposal()[source]

Save a copy of the current value of the parameter in the failed proposals.

render_failed_proposals(param_id: str) -> List[str][source]
update_gradient_memory(param: Parameter)[source]
zero_grad()[source]

Clear all the gradients of the parameters.

set_target_param()[source]
propose()[source]

Propose a new value while keeping the previous value saved on the parameter.

revert()[source]

Revert to the previous value when the evaluation is worse.

step()[source]

Discard the previous value and keep the proposed value.
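
A minimal sketch of the propose / evaluate / accept-or-revert cycle using only the methods documented above. evaluate_pipeline, task_pipeline, client, and num_steps are hypothetical stand-ins, and the textual gradients are assumed to be populated by a backward pass that is not shown:

    optimizer = TGDOptimizer(params=task_pipeline.parameters(), model_client=client)

    best_score = evaluate_pipeline()      # hypothetical validation run
    for _ in range(num_steps):            # num_steps: assumed training budget
        optimizer.zero_grad()             # clear stale textual gradients
        # ... forward pass, loss, and backward happen here (not shown),
        # populating the textual gradients the proposal step reads.
        optimizer.propose()               # draft new values; old ones stay saved
        score = evaluate_pipeline()       # re-evaluate with the proposed values
        if score > best_score:
            optimizer.step()              # keep the proposal, discard the old value
            best_score = score
        else:
            optimizer.revert()            # proposal hurt performance; roll back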

gumbel_top_k(scores, k, *, probs=False, seed=None, temperature=1.0, noise_scale=1.0, counts=None, ucb_beta=0.0)[source]

Gumbel Top-k sampling with balanced exploration-exploitation.

Parameters:
  • scores – list/1D array. If probs=False, treated as logits; if probs=True, treated as probabilities.

  • k – number of indices to sample (k <= len(scores)).

  • probs – True if scores are probabilities.

  • seed – optional RNG seed.

  • temperature (float) – temperature scaling. T<1 amplifies differences; T>1 increases randomness.

  • noise_scale (float) – scale of Gumbel noise. 0 disables stochastic exploration.

  • counts (list/array or None) – evaluation counts n_i per item (for optional UCB bonus).

  • ucb_beta (float) – >0 to enable a lightweight UCB bonus: β * sqrt(log(N+1)/(n_i+1)).

Returns:

indices of the top-k (descending by perturbed score).

Return type:

List[int]
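
A minimal NumPy sketch of the sampling rule these parameters describe; the library's actual implementation may differ in details such as clipping and tie-breaking, and interpreting N as the total evaluation count is an assumption:

    import numpy as np

    def gumbel_top_k_sketch(scores, k, *, probs=False, seed=None,
                            temperature=1.0, noise_scale=1.0,
                            counts=None, ucb_beta=0.0):
        """Sketch of Gumbel Top-k selection following the parameters above."""
        rng = np.random.default_rng(seed)
        s = np.asarray(scores, dtype=float)
        # Probabilities are mapped to logits; raw scores are used as logits.
        logits = np.log(np.clip(s, 1e-12, None)) if probs else s
        logits = logits / temperature          # T<1 sharpens, T>1 flattens
        # Gumbel(0, 1) noise; noise_scale=0 makes selection deterministic.
        u = np.clip(rng.uniform(size=s.shape), 1e-12, 1 - 1e-12)
        perturbed = logits + noise_scale * (-np.log(-np.log(u)))
        if ucb_beta > 0 and counts is not None:
            n = np.asarray(counts, dtype=float)
            total = n.sum()                    # N assumed to be total evaluations
            perturbed += ucb_beta * np.sqrt(np.log(total + 1.0) / (n + 1.0))
        # Indices of the k largest perturbed scores, in descending order.
        return list(np.argsort(-perturbed)[:k])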

generate_top_k_scoring_function(batch_val: List[float], batch_val_acc_list: List[List[int]], window: int | None = None, k: int = 5, epsilon_within: float = 0.0, beta: float = 1.0) -> List[int][source]

Generate top-K indices using Gumbel-Max sampling.

This implements Softmax Acquisition via Gumbel-Max:
  • Add Gumbel noise to historical scores

  • Select top-K by perturbed scores

Based on: https://arxiv.org/pdf/2106.12059

Parameters:
  • batch_val – List of average validation scores (percentages) for each historical prompt. These are the average accuracies across multiple mini-batch evaluations.

  • batch_val_acc_list – List of lists containing individual success/fail records. Not used in this implementation but kept for compatibility.

  • window – Optional window size (not used, kept for compatibility)

  • k – Number of top prompts to select

  • epsilon_within – Epsilon for within-batch exploration (not used)

  • beta – Temperature parameter for Gumbel distribution (not used, default 1.0)

Returns:

List of indices (with size <= k) in descending order by Gumbel values
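
A hypothetical call following the documented signature, assuming it is invoked on a TGDOptimizer instance like the surrounding methods; the scores and success/fail records are made up:

    batch_val = [62.5, 71.0, 68.4, 74.2]                # average accuracies (%)
    batch_val_acc_list = [[1, 0, 1], [1, 1, 1],
                          [1, 0, 1], [1, 1, 0]]         # kept for compatibility
    top_indices = optimizer.generate_top_k_scoring_function(
        batch_val, batch_val_acc_list, k=2
    )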

top_k_selected_prompts(batch_val: List[float], batch_val_acc_list: List[List[int]], k: int | None = None)[source]

Select top-K prompts using Gumbel-Top-K sampling.

This is the main entry point for Gumbel-based prompt selection. It’s called during the optimization loop to probabilistically select promising historical prompts for refinement.

Parameters:
  • batch_val – List of average validation scores (percentages) for each historical prompt. The list index corresponds to the prompt iteration number.

  • batch_val_acc_list – List of lists containing individual success/fail records for each prompt.

  • k – Number of top prompts to select (defaults to self.max_past_history if available, otherwise 3)

Returns:

  • selected_prompts: List of selected prompt strings

  • selected_indices: List of selected prompt indices

  • selected_metadata: Optional metadata (None for base implementation)

Return type:

Tuple of (selected_prompts, selected_indices, selected_metadata)
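
A hypothetical call unpacking the documented return tuple; batch_val and batch_val_acc_list follow the shapes described above:

    prompts, indices, metadata = optimizer.top_k_selected_prompts(
        batch_val, batch_val_acc_list, k=3
    )
    # `metadata` is None in the base implementation, per the docs above.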

to_dict()[source]