adal#

AdalComponent provides an interface to compose the different parts needed for training, such as the eval_fn, train_step, loss_step, optimizers, backward engine, and teacher generator, so that they work with Trainer.

Classes

AdalComponent(task[, eval_fn, loss_fn, ...])

Define a train, eval, and test step for a task pipeline.

class AdalComponent(task: Component, eval_fn: Callable | None = None, loss_fn: LossComponent | None = None, backward_engine: BackwardEngine | None = None, backward_engine_model_config: Dict | None = None, teacher_model_config: Dict | None = None, text_optimizer_model_config: Dict | None = None, *args, **kwargs)[source]#

Bases: Component

Define a train, eval, and test step for a task pipeline.

This serves the following purposes:

1. Organize all parts for training a task pipeline in one place.
2. Help with debugging and testing before the actual training.
3. Add multi-threading support for training and evaluation.
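
For example, a minimal subclass might look like the following sketch. The task pipeline (QATaskPipeline), the eval function (exact_match), and the dataset fields (question, answer) are hypothetical placeholders, not part of the documented API:

import adalflow as adal

class QAAdalComponent(adal.AdalComponent):
    def __init__(self, model_client, model_kwargs):
        # `QATaskPipeline` and `exact_match` are hypothetical placeholders for
        # your own task Component and an eval_fn returning a score in [0, 1].
        task = QATaskPipeline(model_client=model_client, model_kwargs=model_kwargs)
        super().__init__(task=task, eval_fn=exact_match)

    def handle_one_task_sample(self, sample):
        # Map one dataset sample to a call on the task pipeline.
        return self.task, {"question": sample.question}

    def evaluate_one_sample(self, sample, y_pred) -> float:
        # Score the prediction against the ground-truth answer;
        # y_pred is assumed to be a GeneratorOutput here.
        return self.eval_fn(y=y_pred.data, y_gt=sample.answer)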

task: Component#
eval_fn: Callable | None#
loss_fn: LossComponent | None#
backward_engine: BackwardEngine | None#
handle_one_task_sample(sample: Any, *args, **kwargs) Tuple[Callable, Dict][source]#

Return a task call and kwargs for one training sample.

Example:

def handle_one_task_sample(self, sample: Any, *args, **kwargs) -> Tuple[Callable, Dict]:
    return self.task, {"x": sample.x}
handle_one_loss_sample(sample: Any, y_pred: Parameter, *args, **kwargs) Tuple[Callable, Dict][source]#

Return a loss call and kwargs for one loss sample.

You need to ensure that y_pred is a Parameter, and that eval_input is set on both y_gt and y_pred, since eval_input is the actual value used in the evaluation. Make sure it is set up.

Example:

# "y" and "y_gt" are arguments needed
#by the eval_fn inside of the loss_fn if it is a EvalFnToTextLoss

def handle_one_loss_sample(self, sample: Example, pred: adal.Parameter) -> Tuple[Callable, Dict]:
    # prepare gt parameter
    y_gt = adal.Parameter(
        name="y_gt",
        data=sample.answer,
        eval_input=sample.answer,
        requires_opt=False,
    )

    # pred's full_response is the output of the task pipeline, which is a GeneratorOutput
    pred.eval_input = pred.full_response.data
    return self.loss_fn, {"kwargs": {"y": y_gt, "y_pred": pred}}
evaluate_one_sample(sample: Any, y_pred: Any, *args, **kwargs) float[source]#

Used to evaluate a single sample. Return a score in range [0, 1]. The higher the score the better the prediction.
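
A minimal sketch, assuming the raw prediction is a GeneratorOutput, the sample carries an answer field, and the eval_fn accepts y and y_gt keyword arguments (all three are assumptions, not part of the documented signature):

def evaluate_one_sample(self, sample, y_pred, *args, **kwargs) -> float:
    # y_pred is the raw task output here (e.g. a GeneratorOutput), not a Parameter.
    # `sample.answer` and the `y`/`y_gt` keywords are assumptions for this sketch.
    y_label = y_pred.data if y_pred and y_pred.data is not None else ""
    return self.eval_fn(y=y_label, y_gt=sample.answer)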

configure_optimizers(*args, **kwargs) List[Optimizer][source]#

Note: When you use a text optimizer, ensure you also call configure_backward_engine.
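
A hedged sketch of an override that builds both optimizer types with the helper methods documented below. The model client and kwargs are placeholders, and, per the note above, a backward engine should also be configured when a text optimizer is used:

def configure_optimizers(self, *args, **kwargs) -> List[Optimizer]:
    # `llm_client` and `llm_kwargs` are placeholders for your ModelClient
    # and model kwargs; they are not part of the documented API.
    text_optimizers = self.configure_text_optimizer_helper(
        model_client=llm_client, model_kwargs=llm_kwargs
    )
    # Demo optimizer(s) for few-shot demo parameters.
    demo_optimizers = self.configure_demo_optimizer_helper()
    return text_optimizers + demo_optimizers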

configure_backward_engine(*args, **kwargs)[source]#

Configure a backward engine for all generators in the task, used to provide textual feedback (gradients) during backpropagation.

evaluate_samples(samples: Any, y_preds: List, metadata: Dict[str, Any] | None = None, num_workers: int = 2) EvaluationResult[source]#

Run evaluation on samples using parallel processing. Utilizes evaluate_one_sample defined by the user.

Metadata is used to store context that can be derived from the generator input.

Parameters:
  • samples (Any) – The input samples to evaluate.

  • y_preds (List) – The predicted outputs corresponding to each sample.

  • metadata (Optional[Dict[str, Any]]) – Optional metadata dictionary.

  • num_workers (int) – Number of worker threads for parallel processing.

Returns:

An object containing the average score and per-item scores.

Return type:

EvaluationResult
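
A rough usage sketch. The component, samples, and predictions are placeholders, and the attribute names on the result follow the description above rather than a verified API:

# `adal_component`, `val_samples`, and `y_preds` are placeholders.
result = adal_component.evaluate_samples(samples=val_samples, y_preds=y_preds, num_workers=4)
print(result.avg_score)        # average score over the batch (assumed field name)
print(result.per_item_scores)  # per-sample scores (assumed field name)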

pred_step(batch, batch_idx, num_workers: int = 2, running_eval: bool = False, min_score: float | None = None)[source]#

Applies to both train and eval mode.

If you require self.task.train() to be called before training, you can override train_step as:

def train_step(self, batch, batch_idx, num_workers: int = 2) -> List:
    self.task.train()
    return super().train_step(batch, batch_idx, num_workers)
train_step(batch, batch_idx, num_workers: int = 2) List[source]#
validate_condition(steps: int, total_steps: int) bool[source]#

By default, the trainer will validate at every step.
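
To validate less often, you can override this method; a minimal sketch that validates every 4 steps (the interval is an arbitrary choice):

def validate_condition(self, steps: int, total_steps: int) -> bool:
    # Validate every 4th step and on the final step; 4 is an arbitrary choice.
    return steps % 4 == 0 or steps == total_steps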

validation_step(batch, batch_idx, num_workers: int = 2, minimum_score: float | None = None) EvaluationResult[source]#

If you require self.task.eval() to be called before validation, you can override this method as:

def validation_step(self, batch, batch_idx, num_workers: int = 2) -> EvaluationResult:
    self.task.eval()
    return super().validation_step(batch, batch_idx, num_workers)
loss_step(batch, y_preds: List[Parameter], batch_idx, num_workers: int = 2) List[Parameter][source]#

Calculate the loss for the batch.

configure_teacher_generator()[source]#

Configure a teacher generator for all generators in the task for bootstrapping examples.

You can call configure_teacher_generator_helper to easily configure it by passing the model_client and model_kwargs.
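
A hedged sketch of such an override; the teacher model client and kwargs are placeholders for a stronger model:

def configure_teacher_generator(self):
    # `teacher_client` and `teacher_kwargs` are placeholders for a stronger
    # model used to bootstrap demonstrations.
    self.configure_teacher_generator_helper(
        model_client=teacher_client, model_kwargs=teacher_kwargs
    )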

configure_teacher_generator_helper(model_client: ModelClient, model_kwargs: Dict[str, Any], template: str | None = None)[source]#

Configure a teacher generator for all generators in the task for bootstrapping examples.

configure_backward_engine_helper(model_client: ModelClient, model_kwargs: Dict[str, Any], template: str | None = None)[source]#

Configure a backward engine for all generators in the task, used to provide textual feedback (gradients) during backpropagation.

configure_callbacks(save_dir: str | None = 'traces', *args, **kwargs)[source]#

By default, we configure the failure generator callback. Users can override this method to add more callbacks.

run_one_task_sample(sample: Any) Any[source]#

Run one training sample. Used for debugging and testing.

training: bool#
run_one_loss_sample(sample: Any, y_pred: Any) Any[source]#

Run one loss sample. Used for debugging and testing.

configure_demo_optimizer_helper() List[DemoOptimizer][source]#

One demo optimizer can handle multiple demo parameters, but it will only have one dataset (the trainset), which is configured by the Trainer.

If users want to use a different trainset for each demo optimizer, they can configure it themselves.

configure_text_optimizer_helper(model_client: ModelClient, model_kwargs: Dict[str, Any]) List[TextOptimizer][source]#

One text optimizer can handle multiple text parameters.