Optimization#

Base Classes and Data Structures#

GradComponent and LossComponent are subclasses of Component that distinguish gradient and loss components in the optimization process. Subclass them when you want to implement your own components with more customization.

optim.parameter

Parameter is used by Optimizer, Trainer, and AdalComponent for auto-optimization.

optim.optimizer

Base Classes for AdalFlow Optimizers, including Optimizer, TextOptimizer, and DemoOptimizer.

optim.grad_component

Base class for Autograd Components that can be called and backpropagated through.

optim.loss_component

Base class for Autograd Components that can be called and backpropagated through.

optim.types

All data types used by Parameter, Optimizer, AdalComponent, and Trainer.

Few Shot Optimizer#

optim.few_shot.bootstrap_optimizer

Adapted and optimized bootstrap few-shot optimizer.

Textual Gradient#

optim.text_grad.llm_text_loss

Implementation of TextGrad: Automatic “Differentiation” via Text

optim.text_grad.text_loss_with_eval_fn

Adapted from text_grad's String Based Function

optim.text_grad.ops

Text-grad operations such as Sum and Aggregate.

optim.text_grad.tgd_optimizer

Text-grad optimizer and prompts.

Trainer and AdalComponent#

optim.trainer.adal

AdalComponent provides an interface to compose the different parts (eval_fn, train_step, loss_step, optimizers, backward engine, teacher generator, etc.) to work with Trainer.

optim.trainer.trainer

Ready-to-use trainer for LLM task pipelines.

Overview#

class Optimizer[source]#

Bases: object

Base class for all optimizers.

proposing: bool = False#
params: Iterable[Parameter] | Iterable[Dict[str, Any]]#
state_dict()[source]#
propose(*args, **kwargs)[source]#
step(*args, **kwargs)[source]#
revert(*args, **kwargs)[source]#
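
The propose / step / revert triplet above defines a transactional update protocol shared by all optimizers: a new value is staged on the parameters, evaluated, and then either accepted or rolled back. A minimal sketch of that loop, assuming any concrete Optimizer subclass and a hypothetical validate() callable that scores the currently proposed values:

from adalflow.optim.optimizer import Optimizer  # assumed import path


def optimize(optimizer: Optimizer, validate, max_steps: int = 10) -> float:
    """Hypothetical driver loop illustrating the propose/step/revert protocol."""
    best_score = validate()  # score of the current (unmodified) parameters
    for _ in range(max_steps):
        optimizer.propose()        # stage new values on the parameters
        new_score = validate()     # evaluate the proposal
        if new_score > best_score:
            optimizer.step()       # accept: discard the previous values
            best_score = new_score
        else:
            optimizer.revert()     # reject: restore the previous values
    return best_score
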
class RandomSampler(dataset: Sequence[T_co] | None = None, default_num_shots: int | None = None)[source]#

Bases: Sampler, Generic[T_co]

Simple random sampler to sample from the dataset.

set_dataset(dataset: Sequence[T_co])[source]#

Set the dataset for the sampler

random_replace(shots: int, samples: List[Sample[T_co]], replace: bool | None = False) List[Sample[T_co]][source]#

Randomly replace a given number of shots in the samples.

If replace is True, duplicate checks are skipped.

random_sample(shots: int, replace: bool | None = False) List[Sample][source]#

Randomly sample a given number of shots from the dataset. If replace is True, sample with replacement, meaning the same sample can be drawn multiple times.

call(num_shots: int | None = None, replace: bool | None = False) List[Sample][source]#

Abstract method to do the main sampling
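
A small usage sketch, assuming the sampler classes live under adalflow.optim.sampler and using a made-up list-of-dicts dataset:

from adalflow.optim.sampler import RandomSampler  # assumed import path

dataset = [{"question": f"q{i}", "answer": f"a{i}"} for i in range(20)]

sampler = RandomSampler(dataset=dataset, default_num_shots=4)
initial = sampler.random_sample(shots=4, replace=False)      # 4 distinct samples
updated = sampler.random_replace(shots=2, samples=initial)   # swap 2 of them at random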

class ClassSampler(dataset: Sequence[T_co], num_classes: int, get_data_key_fun: Callable, default_num_shots: int | None = None)[source]#

Bases: Sampler, Generic[T_co]

Sample from the dataset based on the class labels.

T_co can be any type of data (e.g., dict, list, etc.); get_data_key_fun is used to extract the class label.

Example: Initialize the sampler as follows:

dataset = [{"coarse_label": i} for i in range(10)]
sampler = ClassSampler[Dict](
    dataset,
    num_classes=6,
    get_data_key_fun=lambda x: x["coarse_label"],
)

random_replace(shots: int, samples: List[Sample], replace: bool | None = False, weights_per_class: List[float] | None = None) Sequence[Sample[T_co]][source]#

Randomly select a given number of shots from the samples and replace them with other samples that have the same class index.

random_sample(num_shots: int, replace: bool | None = False) List[Sample[T_co]][source]#

Randomly sample num_shots from the dataset. If replace is True, sample with replacement.

call(num_shots: int, replace: bool | None = False) List[Sample[T_co]][source]#

Sample num_shots from the dataset. If replace is True, sample with replacement.

class Sampler(*args, **kwargs)[source]#

Bases: Generic[T_co]

dataset: Sequence[object] = None#
set_dataset(dataset: Sequence[T_co])[source]#

Set the dataset for the sampler

random_replace(*args, **kwargs)[source]#

Randomly replace some samples

Implementations may take two arguments (e.g., shots and samples) or three (shots, samples, and replace).

call(*args, **kwargs) List[Sample[T_co]][source]#

Abstract method to do the main sampling

class Parameter(*, id: str | None = None, data: ~optim.parameter.T = None, requires_opt: bool = True, role_desc: str = '', param_type: ~adalflow.optim.types.ParameterType = <ParameterType.PROMPT: prompt, 'Instruction to the language model on task, data, and format.'>, name: str = None, gradient_prompt: str = None, raw_response: str = None, instruction_to_optimizer: str = None, instruction_to_backward_engine: str = None, score: float | None = None, eval_input: object = None, from_response_id: str | None = None, successor_map_fn: ~typing.Dict[str, ~typing.Callable] | None = None)[source]#

Bases: Generic[T]

A data container to represent a parameter used for optimization.

A parameter enforces a specific data type and can be updated in-place. When parameters are assigned as Component attributes, they are automatically added to the component's list of parameters and will appear in the parameters() or named_parameters() methods.

Args:

End users only need to create the Parameter with four arguments and pass it to the prompt_kwargs in the Generator.

  • data (str): the data of the parameter.

  • requires_opt (bool, optional): whether the parameter requires optimization. Default: True.

  • role_desc (str, optional): a description of the parameter's role.

  • param_type (ParameterType, optional): the parameter type, including ParameterType.PROMPT for instruction optimization and ParameterType.DEMOS for few-shot optimization.

  • instruction_to_optimizer (str, optional): instruction to the optimizer. Default: None.

  • instruction_to_backward_engine (str, optional): instruction to the backward engine. Default: None.

The parameter a user creates will automatically be assigned to its variable name/key in the prompt_kwargs, for easy reading and debugging in the trace_graph.
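
A minimal sketch of that end-user pattern, assuming the usual adalflow top-level import and a Generator-based task pipeline (the generator configuration is illustrative only):

import adalflow as adal
from adalflow.optim.types import ParameterType

# A trainable instruction: data, requires_opt, role_desc, and param_type
# are the four arguments end users typically set.
system_prompt = adal.Parameter(
    data="You are a concise assistant. Answer the question directly.",
    requires_opt=True,
    role_desc="system prompt to guide the language model",
    param_type=ParameterType.PROMPT,
)

# Hypothetical generator setup: the key "system_prompt" becomes the
# parameter's name in the trace_graph, as noted above.
# generator = adal.Generator(
#     model_client=...,      # your model client
#     model_kwargs={"model": "gpt-4o-mini"},
#     prompt_kwargs={"system_prompt": system_prompt},
# )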

References:

  1. karpathy/micrograd

proposing: bool = False#
predecessors: Set[Parameter] = {}#
peers: Set[Parameter] = {}#
input_args: Dict[str, Any] = None#
full_response: object = None#
backward_engine_disabled: bool = False#
id: str = None#
role_desc: str = ''#
name: str = None#
param_type: ParameterType#
data: T = None#
eval_input: object = None#
from_response_id: str = None#
successor_map_fn: Dict[str, Callable] = None#
map_to_successor(successor: object) T[source]#

Apply the map function to the successor based on the successor’s id.

add_successor_map_fn(successor: object, map_fn: Callable)[source]#

Add or update a map function for a specific successor using its id.

check_if_already_computed_gradient_respect_to(response_id: str) bool[source]#
add_gradient(gradient: Parameter)[source]#
set_predecessors(predecessors: List[Parameter] = None)[source]#
set_grad_fn(grad_fn)[source]#
get_param_info()[source]#
set_peers(peers: List[Parameter] = None)[source]#
trace_forward_pass(input_args: Dict[str, Any], full_response: object)[source]#

Trace the forward pass of the parameter.

set_eval_fn_input(eval_input: object)[source]#

Set the input for the eval_fn.

set_score(score: float)[source]#
add_to_trace(trace: DataClass, is_teacher: bool = True)[source]#

Called by the generator.forward to add a trace to the parameter.

It is important to allow the trace to be updated, as this changes the sampling weight: if a sample's score increases as training goes on, it becomes less likely to be sampled, keeping the samples diverse. Otherwise, the optimizer would keep sampling failed examples.

add_score_to_trace(trace_id: str, score: float, is_teacher: bool = True)[source]#

Called by the generator.backward to add the eval score to the trace.

propose_data(data: T, demos: List[DataClass] | None = None)[source]#

Used by the optimizer to stage new data while saving the previous data in case of a revert.

revert_data(include_demos: bool = False)[source]#

Revert the data to the previous data.

step_data(include_demos: bool = False)[source]#

Use PyTorch’s optimizer syntax to finalize the update of the data.

get_grad_fn()[source]#
update_value(data: T)[source]#

Update the parameter’s value in-place, checking for type correctness.

reset_gradients()[source]#
reset_gradients_context()[source]#
get_gradients_names() str[source]#
get_gradient_and_context_text() str[source]#

Aggregates and returns:
  1. the gradients
  2. the context text for which the gradients are computed

get_short_value(n_words_offset: int = 10) str[source]#

Returns a short version of the value of the variable. We sometimes use it during optimization when we want to see the value of the variable but not the entire value, either to save tokens or to avoid repeating very long variables such as code or solutions to hard problems.

Parameters:
  • n_words_offset (int) – The number of words to show from the beginning and the end of the value.

static trace_graph(root: Parameter) Tuple[Set[Parameter], Set[Tuple[Parameter, Parameter]]][source]#
backward()[source]#
draw_graph(add_grads: bool = True, format: Literal['png', 'svg'] = 'png', rankdir: Literal['LR', 'TB'] = 'TB', filepath: str | None = None)[source]#

Draw the graph of the parameter and its gradients.

Parameters:
  • add_grads (bool, optional) – Whether to add gradients to the graph. Defaults to True.

  • format (str, optional) – The format of the output file. Defaults to “png”.

  • rankdir (str, optional) – The direction of the graph. Defaults to “TB”.

  • filepath (str, optional) – The path to save the graph. Defaults to None.
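
A hedged sketch of rendering the graph after backpropagation, where loss stands in for any Parameter returned by a loss component's forward pass:

# `loss` is assumed to be a Parameter produced by a loss component.
loss.backward()              # populate gradients on the predecessor parameters
loss.draw_graph(
    add_grads=True,          # include gradient text in the node labels
    format="svg",
    rankdir="TB",
    filepath="trace_graph",  # illustrative output location
)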

to_dict()[source]#
classmethod from_dict(data: dict)[source]#
class BackwardContext(backward_fn: Callable, backward_engine: BackwardEngine = None, *args, **kwargs)[source]#

Bases: object

Represents a context for backward computation.

Parameters:
  • backward_fn (callable) – The backward function to be called during backward computation.

  • args – Variable length argument list to be passed to the backward function.

  • kwargs – Arbitrary keyword arguments to be passed to the backward function.

Variables:
  • backward_fn (callable) – The backward function to be called during backward computation.

  • fn_name (str) – The fully qualified name of the backward function.

  • args – Variable length argument list to be passed to the backward function.

  • kwargs – Arbitrary keyword arguments to be passed to the backward function.

Method __call__(backward_engine: EngineLM) -> Any:

Calls the backward function with the given backward engine and returns the result.

Method __repr__() -> str:

Returns a string representation of the BackwardContext object.

class BootstrapFewShot(params: List[Parameter], raw_shots: int | None = None, bootstrap_shots: int | None = None, dataset: List[DataClass] | None = None, weighted: bool = True, exclude_input_fields_from_bootstrap_demos: bool = False)[source]#

Bases: DemoOptimizer

BootstrapFewShot performs few-shot sampling used in few-shot ICL.

It is used to optimize demo parameters. Based on research from the AdalFlow team and the DSPy library.

Compared with DSPy's version:
  1. We added weighted sampling for both raw and augmented demos, prioritizing demos that failed on their own but succeeded as augmented demos, based on the evaluation score collected while backpropagating the demo samples.

  2. By default, we exclude the input fields from the augmented demos. Our research finds that using only the reasoning demonstrations from the teacher model can be more effective in some cases than including both the input and output samples, and is more token efficient.

Reference: - DSPy: Compiling declarative language model calls into state-of-the-art pipelines.
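
A construction sketch, assuming the import path below and hypothetical demo_params (ParameterType.DEMOS parameters from the task pipeline) and trainset (a list of DataClass samples):

from adalflow.optim.few_shot.bootstrap_optimizer import BootstrapFewShot  # assumed path

optimizer = BootstrapFewShot(
    params=demo_params,      # hypothetical: DEMOS parameters to optimize
    raw_shots=2,             # raw examples sampled from the train set
    bootstrap_shots=2,       # teacher-augmented examples
    dataset=trainset,        # hypothetical: list of DataClass samples
    weighted=True,           # weight sampling by the traced evaluation scores
    exclude_input_fields_from_bootstrap_demos=True,
)
optimizer.propose()          # stage new demos; then step() to accept or revert() to undo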

add_scores(ids: List[str], scores: List[float], is_teacher: bool = True)[source]#
config_shots(raw_shots: int, bootstrap_shots: int)[source]#

Initialize the samples for each parameter.

config_dataset(dataset: List[DataClass])[source]#
property num_shots: int#
sample(augmented_demos: Dict[str, DataClass], demos: Dict[str, DataClass], dataset: List[DataClass], raw_shots: int, bootstrap_shots: int, weighted: bool = True)[source]#

Performs weighted sampling; ensures the score is in range [0, 1], where a higher score means better accuracy.

static samples_to_str(samples: List[DataClass], augmented: bool = False, exclude_inputs: bool = False) str[source]#
propose()[source]#

Propose a new value while keeping the previous value saved on the parameter.

revert()[source]#

Revert to the previous value when the evaluation is worse.

step()[source]#

Discard the previous value and keep the proposed value.

class TGDOptimizer(params: Iterable[Parameter] | Iterable[Dict[str, Any]], model_client: ModelClient, model_kwargs: Dict[str, object] = {}, constraints: List[str] = None, optimizer_system_prompt: str = '\nYou are part of an optimization system that refines existing variable values based on feedback.\n\nYour task: Propose a new variable value in response to the feedback.\n1. Address the concerns raised in the feedback while preserving positive aspects.\n2. Observe past performance patterns when provided and to keep the good quality.\n3. Consider the variable in the context of its peers if provided.\n   FYI:\n   - If a peer will be optimized itself, do not overlap with its scope.\n   - Otherwise, you can overlap if it is necessary to address the feedback.\n\nOutput:\nProvide only the new variable value between {{new_variable_start_tag}} and {{new_variable_end_tag}} tags.\n\nTips:\n1. Eliminate unnecessary words or phrases.\n2. Add new elements to address specific feedback.\n3. Be creative and present the variable differently.\n{% if instruction_to_optimizer %}\n4. {{instruction_to_optimizer}}\n{% endif %}\n', in_context_examples: List[str] = None, num_gradient_memory: int = 0, max_past_history: int = 3)[source]#

Bases: TextOptimizer

Textual Gradient Descent (LLM-based) optimizer for text-based variables.
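
A construction sketch, assuming the import paths below and a hypothetical prompt_params list of PROMPT parameters collected from the task pipeline:

from adalflow.optim.text_grad.tgd_optimizer import TGDOptimizer  # assumed path
from adalflow.components.model_client import OpenAIClient        # assumed client

text_optimizer = TGDOptimizer(
    params=prompt_params,    # hypothetical: PROMPT parameters to optimize
    model_client=OpenAIClient(),
    model_kwargs={"model": "gpt-4o", "temperature": 1.0},
    constraints=["Keep the instruction under 100 words."],
    max_past_history=3,      # how many past proposals to show the optimizer LLM
)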

proposing: bool = False#
params_history: Dict[str, List[HistoryPrompt]] = {}#
params: Iterable[Parameter] | Iterable[Dict[str, Any]]#
constraints: List[str]#
property constraint_text#

Returns a formatted string representation of the constraints.

Returns:

A string containing the constraints in the format “Constraint {index}: {constraint}”.

Return type:

str

add_score_to_params(val_score: float)[source]#
add_score_to_current_param(param_id: str, param: Parameter, score: float)[source]#
add_history(param_id: str, history: HistoryPrompt)[source]#
render_history(param_id: str) List[str][source]#
get_gradient_memory_text(param: Parameter) str[source]#
update_gradient_memory(param: Parameter)[source]#
zero_grad()[source]#

Clear all the gradients of the parameters.

propose()[source]#

Propose a new value while keeping the previous value saved on the parameter.

revert()[source]#

Revert to the previous value when the evaluation is worse.

step()[source]#

Discard the previous value and keep the proposed value.

class EvalFnToTextLoss(eval_fn: Callable | BaseEvaluator, eval_fn_desc: str, backward_engine: BackwardEngine | None = None, model_client: ModelClient = None, model_kwargs: Dict[str, object] = None)[source]#

Bases: LossComponent

Convert an evaluation function to a text loss.

LossComponent takes an eval function and outputs a score (usually a float in the range [0, 1], where higher is better, unlike the loss function in model training).

In math:

score/loss = eval_fn(y_pred, y_gt)

The gradient/feedback d(score)/d(y_pred) will be computed using a backward engine:

Gradient_context = GradientContext(
    context=conversation_str,
    response_desc=response.role_desc,
    variable_desc=role_desc,
)

Parameters:
  • eval_fn – The evaluation function that takes a pair of y and y_gt and returns a score.

  • eval_fn_desc – Description of the evaluation function.

  • backward_engine – The backward engine to use for the text prompt optimization.

  • model_client – The model client to use for the backward engine if backward_engine is not provided.

  • model_kwargs – The model kwargs to use for the backward engine if backward_engine is not provided.
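
A hedged sketch of wiring a simple exact-match evaluation function into EvalFnToTextLoss (the import paths and the exact_match helper are illustrative):

from adalflow.optim.text_grad.text_loss_with_eval_fn import EvalFnToTextLoss  # assumed path
from adalflow.components.model_client import OpenAIClient                     # assumed client


def exact_match(y: str, y_gt: str) -> float:
    """Hypothetical eval_fn: 1.0 on exact match, else 0.0."""
    return 1.0 if y.strip() == y_gt.strip() else 0.0


loss_fn = EvalFnToTextLoss(
    eval_fn=exact_match,
    eval_fn_desc="exact_match: 1 if the prediction equals the ground truth else 0",
    model_client=OpenAIClient(),         # used to build the backward engine
    model_kwargs={"model": "gpt-4o"},
)
# The loss is then called with Parameters keyed to match the eval_fn's keyword
# arguments, e.g. loss = loss_fn(kwargs={"y": pred_param, "y_gt": gt_param})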

forward(kwargs: Dict[str, Parameter], response_desc: str = None, metadata: Dict[str, str] = None) Parameter[source]#

Default just wraps the call method.

set_backward_engine(backward_engine: BackwardEngine = None, model_client: ModelClient = None, model_kwargs: Dict[str, object] = None)[source]#
backward(response: Parameter, eval_fn_desc: str, kwargs: Dict[str, Parameter], backward_engine: BackwardEngine | None = None, metadata: Dict[str, str] = None)[source]#

Make sure to set backward_engine for text prompt optimization. It can be None if you are only doing demo optimization; in that case there will be no gradients and the score is simply backpropagated.

class LLMAsTextLoss(prompt_kwargs: Dict[str, str | Parameter], model_client: ModelClient, model_kwargs: Dict[str, object])[source]#

Bases: LossComponent

Evaluate the final RAG response using an LLM judge.

The LLM judge will have:
  • eval_system_prompt: The system prompt to evaluate the response.
  • y_hat: The response to evaluate.
  • Optional y: The correct response to compare against.

The loss will be a Parameter with the evaluation result and can be used to compute gradients. This loss uses an LLM/Generator as the computation/transformation operator, so its gradient is found through the Generator's backward method.
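
A construction sketch, assuming the import paths below; the judge prompt content is illustrative:

from adalflow.optim.text_grad.llm_text_loss import LLMAsTextLoss  # assumed path
from adalflow.components.model_client import OpenAIClient         # assumed client

loss_fn = LLMAsTextLoss(
    prompt_kwargs={
        # eval_system_prompt may be a plain string or a Parameter
        "eval_system_prompt": "Judge whether the response fully answers the question.",
    },
    model_client=OpenAIClient(),
    model_kwargs={"model": "gpt-4o"},
)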

forward(*args, **kwargs) Parameter[source]#

Default just wraps the call method.

class Trainer(adaltask: AdalComponent, optimization_order: Literal['sequential', 'mix'] = 'sequential', strategy: Literal['random', 'constrained'] = 'constrained', max_steps: int = 1000, train_batch_size: int | None = 4, num_workers: int = 4, ckpt_path: str = None, batch_val_score_threshold: float | None = 1.0, max_error_samples: int | None = 4, max_correct_samples: int | None = 4, max_proposals_per_step: int = 5, train_loader: Any | None = None, train_dataset: Any | None = None, val_dataset: Any | None = None, test_dataset: Any | None = None, raw_shots: int | None = None, bootstrap_shots: int | None = None, weighted_sampling: bool = False, exclude_input_fields_from_bootstrap_demos: bool = False, debug: bool = False, save_traces: bool = False, *args, **kwargs)[source]#

Bases: Component

Ready-to-use trainer for LLM task pipelines, optimizing all types of parameters.

Training set: used to pass an initial proposed prompt or for few-shot sampling. Validation set: used to select the final prompt or samples. Test set: used to evaluate the final prompt or samples.

Parameters:
  • adaltask – AdalComponent: AdalComponent instance

  • strategy – Literal[“random”, “constrained”]: Strategy to use for the optimizer

  • max_steps – int: Maximum number of steps to run the optimizer

  • num_workers – int: Number of workers to use for parallel processing

  • ckpt_path – str: Path to save the checkpoint files, default to ~/.adalflow/ckpt.

  • batch_val_score_threshold – Optional[float]: Threshold for skipping a batch

  • max_error_samples – Optional[int]: Maximum number of error samples to keep

  • max_correct_samples – Optional[int]: Maximum number of correct samples to keep

  • max_proposals_per_step – int: Maximum number of proposals to generate per step

  • train_loader – Any: DataLoader instance for training

  • train_dataset – Any: Training dataset

  • val_dataset – Any: Validation dataset

  • test_dataset – Any: Test dataset

  • few_shots_config – Optional[FewShotConfig]: Few shot configuration

  • save_traces – bool: Save traces for synthetic data generation or debugging

optimizer: Optimizer = None#
ckpt_file: str | None = None#
optimization_order: Literal['sequential', 'mix'] = 'sequential'#
strategy: Literal['random', 'constrained']#
max_steps: int#
ckpt_path: str | None = None#
adaltask: AdalComponent#
num_workers: int = 4#
train_loader: Any#
val_dataset = None#
test_dataset = None#
batch_val_score_threshold: float | None = 1.0#
max_error_samples: int | None = 8#
max_correct_samples: int | None = 8#
max_proposals_per_step: int = 5#
train_batch_size: int | None = 4#
debug: bool = False#
diagnose(dataset: Any, split: str = 'train')[source]#

Run an evaluation on the trainset to track all error responses and their raw responses, using AdalComponent's default configure_callbacks.

Parameters:
  • dataset (Any) – Dataset to evaluate.

  • split (str) – Split name; defaults to "train" and is also used as the directory name for saving the logs.

Example:

trainset, valset, testset = load_datasets(max_samples=10)
adaltask = TGDWithEvalFnLoss(
    task_model_config=llama3_model,
    backward_engine_model_config=llama3_model,
    optimizer_model_config=llama3_model,
)

trainer = Trainer(adaltask=adaltask)
diagnose = trainer.diagnose(dataset=trainset)
print(diagnose)
fit(*, adaltask: AdalComponent | None = None, train_loader: Any | None = None, train_dataset: Any | None = None, val_dataset: Any | None = None, test_dataset: Any | None = None, debug: bool = False, save_traces: bool = False, raw_shots: int | None = None, bootstrap_shots: int | None = None, resume_from_ckpt: str | None = None)[source]#

train_loader: An iterable or collection of iterables specifying training samples.
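
A hedged sketch of a typical fit() call, reusing the adaltask, trainset, valset, and testset names from the diagnose example above:

trainer = Trainer(
    adaltask=adaltask,
    strategy="constrained",
    max_steps=12,
    train_batch_size=4,
)
trainer.fit(
    train_dataset=trainset,
    val_dataset=valset,
    test_dataset=testset,
    debug=False,
)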

initial_validation(val_dataset: Any, test_dataset: Any)[source]#
gather_trainer_states()[source]#
prep_ckpt_file_path(trainer_state: Dict[str, Any] = None)[source]#

Prepare the checkpoint root path: ~/.adalflow/ckpt/task_name/.

It also generates a unique checkpoint file name based on the strategy, max_steps, and a unique hash key. For multiple runs with the same AdalComponent + Trainer setup, the run number is incremented.

class AdalComponent(task: Component, eval_fn: Callable | None = None, loss_fn: LossComponent | None = None, backward_engine: BackwardEngine | None = None, backward_engine_model_config: Dict | None = None, teacher_model_config: Dict | None = None, text_optimizer_model_config: Dict | None = None, *args, **kwargs)[source]#

Bases: Component

Define a train, eval, and test step for a task pipeline.

This serves the following purposes:
  1. Organize all parts for training a task pipeline in one place.
  2. Help with debugging and testing before the actual training.
  3. Add multi-threading support for training and evaluation.

task: Component#
eval_fn: Callable | None#
loss_fn: LossComponent | None#
backward_engine: BackwardEngine | None#
handle_one_task_sample(sample: Any, *args, **kwargs) Tuple[Callable, Dict][source]#

Return a task call and kwargs for one training sample.

Example:

def handle_one_task_sample(self, sample: Any, *args, **kwargs) -> Tuple[Callable, Dict]:
    return self.task, {"x": sample.x}
handle_one_loss_sample(sample: Any, y_pred: Parameter, *args, **kwargs) Tuple[Callable, Dict][source]#

Return a loss call and kwargs for one loss sample.

You need to ensure y_pred is a Parameter, and that the real input used for y_gt and y_pred is eval_input. Make sure it is set up.

Example:

# "y" and "y_gt" are arguments needed
#by the eval_fn inside of the loss_fn if it is a EvalFnToTextLoss

def handle_one_loss_sample(self, sample: Example, pred: adal.Parameter) -> Dict:
    # prepare gt parameter
    y_gt = adal.Parameter(
        name="y_gt",
        data=sample.answer,
        eval_input=sample.answer,
        requires_opt=False,
    )

    # pred's full_response is the output of the task pipeline which is GeneratorOutput
    pred.eval_input = pred.full_response.data
    return self.loss_fn, {"kwargs": {"y": y_gt, "y_pred": pred}}
evaluate_one_sample(sample: Any, y_pred: Any, *args, **kwargs) float[source]#

Used to evaluate a single sample. Return a score in range [0, 1]. The higher the score the better the prediction.
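
A hedged sketch of overriding evaluate_one_sample in an AdalComponent subclass; sample.answer and the GeneratorOutput-style y_pred.data access are assumptions about the user's task pipeline:

def evaluate_one_sample(self, sample, y_pred, *args, **kwargs) -> float:
    # y_pred is assumed to be a GeneratorOutput-like object; fall back to "" on failure
    pred_text = y_pred.data if y_pred and y_pred.data is not None else ""
    return float(self.eval_fn(pred_text, sample.answer))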

configure_optimizers(*args, **kwargs) List[Optimizer][source]#

Note: When you use a text optimizer, ensure you also call configure_backward_engine.
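
A hedged sketch of configure_optimizers in an AdalComponent subclass, combining the text and demo optimizer helpers documented below; gpt_4o_model is a hypothetical dict with model_client and model_kwargs keys:

def configure_optimizers(self, *args, **kwargs):
    # text optimizers need a backward engine; see configure_backward_engine below
    text_optimizers = self.configure_text_optimizer_helper(**gpt_4o_model)
    demo_optimizers = self.configure_demo_optimizer_helper()
    return text_optimizers + demo_optimizers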

configure_backward_engine(*args, **kwargs)[source]#

Configure a backward engine for all generators in the task for bootstrapping examples.

evaluate_samples(samples: Any, y_preds: List, metadata: Dict[str, Any] | None = None, num_workers: int = 2) EvaluationResult[source]#

Run evaluation on samples using parallel processing. Utilizes evaluate_one_sample defined by the user.

Metadata is used for storing context that you can find from generator input.

Parameters:
  • samples (Any) – The input samples to evaluate.

  • y_preds (List) – The predicted outputs corresponding to each sample.

  • metadata (Optional[Dict[str, Any]]) – Optional metadata dictionary.

  • num_workers (int) – Number of worker threads for parallel processing.

Returns:

An object containing the average score and per-item scores.

Return type:

EvaluationResult

pred_step(batch, batch_idx, num_workers: int = 2, running_eval: bool = False, min_score: float | None = None)[source]#

Applies to both train and eval mode.

If you require self.task.train() to be called before training, you can override this method as:

def train_step(self, batch, batch_idx, num_workers: int = 2) -> List:
    self.task.train()
    return super().train_step(batch, batch_idx, num_workers)
train_step(batch, batch_idx, num_workers: int = 2) List[source]#
validate_condition(steps: int, total_steps: int) bool[source]#

By default, the trainer validates at every step.

validation_step(batch, batch_idx, num_workers: int = 2, minimum_score: float | None = None) EvaluationResult[source]#

If you require self.task.eval() to be called before validation, you can override this method as:

def validation_step(self, batch, batch_idx, num_workers: int = 2) -> List:
    self.task.eval()
    return super().validation_step(batch, batch_idx, num_workers)
loss_step(batch, y_preds: List[Parameter], batch_idx, num_workers: int = 2) List[Parameter][source]#

Calculate the loss for the batch.

configure_teacher_generator()[source]#

Configure a teacher generator for all generators in the task for bootstrapping examples.

You can call configure_teacher_generator_helper to easily configure it by passing the model_client and model_kwargs.

configure_teacher_generator_helper(model_client: ModelClient, model_kwargs: Dict[str, Any], template: str | None = None)[source]#

Configure a teacher generator for all generators in the task for bootstrapping examples.

configure_backward_engine_helper(model_client: ModelClient, model_kwargs: Dict[str, Any], template: str | None = None)[source]#

Configure a backward engine for all generators in the task for bootstrapping examples.

configure_callbacks(save_dir: str | None = 'traces', *args, **kwargs)[source]#

By default, we configure the failure generator callback. Users can override this method to add more callbacks.

run_one_task_sample(sample: Any) Any[source]#

Run one training sample. Used for debugging and testing.

run_one_loss_sample(sample: Any, y_pred: Any) Any[source]#

Run one loss sample. Used for debugging and testing.

configure_demo_optimizer_helper() List[DemoOptimizer][source]#

One demo optimizer can handle multiple demo parameters, but it will only have one dataset (the trainset) configured by the Trainer.

If users want to use a different trainset for different demo optimizers, they can configure it themselves.

configure_text_optimizer_helper(model_client: ModelClient, model_kwargs: Dict[str, Any]) List[TextOptimizer][source]#

One text optimizer can handle multiple text parameters.

class DemoOptimizer(weighted: bool = True, dataset: Sequence[DataClass] = None, exclude_input_fields_from_bootstrap_demos: bool = False, *args, **kwargs)[source]#

Bases: Optimizer

Base class for all demo optimizers.

Demo optimizers perform few-shot optimization: they sample raw examples from the train dataset or bootstrap examples from the model's own outputs, and work with a sampler to generate new values for a given text prompt.

If bootstrapping is used, a teacher generator is required to generate the examples.

dataset: Sequence[DataClass]#
exclude_input_fields_from_bootstrap_demos: bool = False#
use_weighted_sampling(weighted: bool)[source]#
config_shots(*args, **kwargs)[source]#

Initialize the samples for each parameter.

set_dataset(dataset: Sequence[DataClass])[source]#

Set the dataset for the optimizer.

class TextOptimizer(*args, **kwargs)[source]#

Bases: Optimizer

Base class for all text optimizers.

Text optimizers use textual gradient descent, a variant of gradient descent that optimizes text directly. They generate new values for a given text prompt. This includes:
  • system prompt
  • output format
  • prompt template

zero_grad()[source]#

Clear all the gradients of the parameters.