tgd_optimizer
Text-grad optimizer and prompts. It also incorporates methods from the OPRO LLM optimizer.
Combined with auto-diff gradients, this makes it possible to optimize any prompt parameter in a task pipeline.
OPRO paper: https://arxiv.org/abs/2309.03409 Source code: google-deepmind/opro
Classes
Instruction | Structure variable values for instructions.
TGDOptimizer | Textual Gradient Descent (LLM) optimizer for text-based variables.
- class HistoryPrompt(id: str, value: str, eval_score: float)
Bases: DataClass
- id: str
- value: str
- eval_score: float
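To illustrate how such records might be collected, the sketch below uses a plain `dataclasses` stand-in for `HistoryPrompt` (not the library's `DataClass` base) and a hypothetical helper, `record`, that keeps only the best-scoring entries per parameter id. The retention policy and the `max_entries` name are assumptions made for illustration, not documented library behavior.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class HistoryPrompt:
    """Stand-in mirroring the documented fields; not the library class."""
    id: str
    value: str
    eval_score: float


def record(history: Dict[str, List[HistoryPrompt]], param_id: str,
           entry: HistoryPrompt, max_entries: int = 3) -> None:
    """Hypothetical helper: append, then keep the top `max_entries` by score."""
    entries = history.setdefault(param_id, [])
    entries.append(entry)
    # Highest eval_score first, then drop everything past the cap.
    entries.sort(key=lambda h: h.eval_score, reverse=True)
    del entries[max_entries:]


history: Dict[str, List[HistoryPrompt]] = {}
for i, score in enumerate([0.61, 0.74, 0.58, 0.80]):
    record(
        history,
        "system_prompt",
        HistoryPrompt(id=f"v{i}", value=f"prompt draft {i}", eval_score=score),
    )

best = history["system_prompt"][0]
```

Keeping the list sorted by score means the first entry is always the best candidate seen so far for that parameter.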
- class Instruction(text: str, score: float, responses: List[str] | None = None, gts: List[str] | None = None)
Bases: DataClass
Structure variable values for instructions. Can be used in the history of instructions.
- text: str
- score: float
- responses: List[str] | None = None
- gts: List[str] | None = None
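As a rough sketch of how the fields could fit together, the example below assumes that `gts` holds ground-truth answers paired with `responses`, and derives `score` from a hypothetical exact-match metric. The `Instruction` class here is a plain `dataclasses` stand-in, and both `exact_match_score` and the history-line layout are illustrative assumptions rather than part of the library.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Instruction:
    """Stand-in mirroring the documented fields; not the library class."""
    text: str
    score: float
    responses: Optional[List[str]] = None
    gts: Optional[List[str]] = None


def exact_match_score(responses: List[str], gts: List[str]) -> float:
    """Hypothetical metric: fraction of responses equal to their ground truth."""
    if not gts or len(responses) != len(gts):
        raise ValueError("responses and gts must be non-empty and the same length")
    return sum(r.strip() == g.strip() for r, g in zip(responses, gts)) / len(gts)


responses = ["4", "9", "15"]
gts = ["4", "9", "16"]
inst = Instruction(
    text="Answer with the number only.",
    score=exact_match_score(responses, gts),
    responses=responses,
    gts=gts,
)

# Render one history line; the layout is illustrative only.
history_line = f"score={inst.score:.2f} | {inst.text}"
```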
- class TGDOptimizer(params: Iterable[Parameter] | Iterable[Dict[str, Any]], model_client: ModelClient, model_kwargs: Dict[str, object] = {}, constraints: List[str] = None, optimizer_system_prompt: str = '\nYou are part of an optimization system that refines existing variable values based on feedback.\n\nYour task: Propose a new variable value in response to the feedback.\n1. Address the concerns raised in the feedback while preserving positive aspects.\n2. Observe past performance patterns when provided and to keep the good quality.\n3. Consider the variable in the context of its peers if provided.\n FYI:\n - If a peer will be optimized itself, do not overlap with its scope.\n - Otherwise, you can overlap if it is necessary to address the feedback.\n\nOutput:\nProvide only the new variable value between {{new_variable_start_tag}} and {{new_variable_end_tag}} tags.\n\nTips:\n1. Eliminate unnecessary words or phrases.\n2. Add new elements to address specific feedback.\n3. Be creative and present the variable differently.\n{% if instruction_to_optimizer %}\n4. {{instruction_to_optimizer}}\n{% endif %}\n', in_context_examples: List[str] = None, num_gradient_memory: int = 0, max_past_history: int = 3)
Bases: TextOptimizer
Textual Gradient Descent (LLM) optimizer for text-based variables.
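The default system prompt asks the model to return the new variable value between a start tag and an end tag. A minimal sketch of pulling that value out of a reply is shown below. The tag strings `<VARIABLE>` and `</VARIABLE>` and the helper name `extract_between_tags` are assumptions for illustration. The actual tags are whatever is passed to the template as `new_variable_start_tag` and `new_variable_end_tag`.

```python
import re
from typing import Optional

# Hypothetical tag values; the real ones are supplied to the prompt template
# through new_variable_start_tag / new_variable_end_tag.
START_TAG = "<VARIABLE>"
END_TAG = "</VARIABLE>"


def extract_between_tags(text: str, start: str = START_TAG,
                         end: str = END_TAG) -> Optional[str]:
    """Return the text between the first start/end tag pair, or None if absent."""
    pattern = re.escape(start) + r"(.*?)" + re.escape(end)
    # DOTALL lets the proposed value span multiple lines.
    match = re.search(pattern, text, flags=re.DOTALL)
    return match.group(1).strip() if match else None


reply = (
    "Here is my proposal.\n"
    "<VARIABLE>\nAnswer concisely and show your reasoning.\n</VARIABLE>"
)
new_value = extract_between_tags(reply)
```

Returning `None` when the tags are missing lets the caller decide whether to retry or keep the previous value.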
- proposing: bool = False
- params_history: Dict[str, List[HistoryPrompt]] = {}
- params: Iterable[Parameter] | Iterable[Dict[str, Any]]
- constraints: List[str]
- property constraint_text
Returns a formatted string representation of the constraints.
- Returns: A string containing the constraints in the format “Constraint {index}: {constraint}”.
- Return type: str
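The format described for `constraint_text` can be sketched as follows. This is a standalone function, not the library's property. It assumes indices start at 1 and entries are separated by newlines, neither of which is stated above.

```python
from typing import List


def format_constraints(constraints: List[str]) -> str:
    """Join constraints as 'Constraint {index}: {constraint}', one per line.

    Assumption: indices start at 1 and entries are newline-separated.
    """
    return "\n".join(
        f"Constraint {i}: {c}" for i, c in enumerate(constraints, start=1)
    )


text = format_constraints(
    ["Keep it under 50 words.", "Do not mention the dataset."]
)
```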
- add_history(param_id: str, history: HistoryPrompt)