base

Abstract base class for evaluation metrics.

Classes

BaseEvaluator(*args, **kwargs)
EvaluationResult(avg_score[, ...])    Evaluation result.

class EvaluationResult(avg_score: float, per_item_scores: List[float] | None = None, additional_info: dict | None = None)

    Bases: object

    Evaluation result.

    avg_score: float

    per_item_scores: List[float] | None = None

    additional_info: dict | None = None

class BaseEvaluator(*args, **kwargs)

    Bases: object

    compute_single_item(*args, **kwargs) → float

        Compute the score for a single item.

    compute(*args, **kwargs) → Any

        Evaluate a list of predictions and ground truth values, and return the overall score and per-item scores.
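For concreteness, below is a minimal sketch of how a concrete evaluator could be built on this interface. The ExactMatchEvaluator subclass, the stand-in class bodies, and the dataclass implementation of EvaluationResult are illustrative assumptions, not part of the documented API; adapt the import path and bodies to the actual base module.

    # Sketch only: stand-ins mirror the documented signatures so the example runs standalone.
    from dataclasses import dataclass
    from typing import Any, List, Optional


    @dataclass
    class EvaluationResult:
        """Evaluation result (mirrors the documented fields)."""
        avg_score: float
        per_item_scores: Optional[List[float]] = None
        additional_info: Optional[dict] = None


    class BaseEvaluator:
        """Minimal stand-in for the documented base class."""

        def compute_single_item(self, *args: Any, **kwargs: Any) -> float:
            """Compute the score for a single item."""
            raise NotImplementedError

        def compute(self, *args: Any, **kwargs: Any) -> Any:
            """Evaluate predictions against ground truth; return overall and per-item scores."""
            raise NotImplementedError


    class ExactMatchEvaluator(BaseEvaluator):
        """Hypothetical subclass: scores 1.0 for an exact string match, else 0.0."""

        def compute_single_item(self, prediction: str, reference: str) -> float:
            return float(prediction.strip() == reference.strip())

        def compute(self, predictions: List[str], references: List[str]) -> EvaluationResult:
            scores = [self.compute_single_item(p, r) for p, r in zip(predictions, references)]
            avg = sum(scores) / len(scores) if scores else 0.0
            return EvaluationResult(avg_score=avg, per_item_scores=scores)


    if __name__ == "__main__":
        evaluator = ExactMatchEvaluator()
        result = evaluator.compute(["paris", "rome"], ["paris", "madrid"])
        print(result.avg_score)        # 0.5
        print(result.per_item_scores)  # [1.0, 0.0]

The design mirrors the documented split: compute_single_item holds the per-item scoring logic, while compute aggregates the per-item scores into an EvaluationResult carrying both the average and the individual values.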