base

Abstract base class for evaluation metrics.

Classes

BaseEvaluator(*args, **kwargs)

Abstract base class for evaluation metrics.

EvaluationResult(avg_score[, ...])

Evaluation result.

class EvaluationResult(avg_score: float, per_item_scores: List[float] | None = None, additional_info: dict | None = None)[source]

Bases: object

Evaluation result.

avg_score: float

The average score across all evaluated items.

per_item_scores: List[float] | None = None

Optional list of per-item scores, one per evaluated item.

additional_info: dict | None = None

Optional dictionary of additional information about the evaluation.
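
A minimal usage sketch, assuming EvaluationResult behaves as a plain data container; the import path below is an assumption based on this page's module name:

    # Assumed import path; adjust to the actual package layout.
    from base import EvaluationResult

    # Scores for three evaluated items.
    scores = [0.8, 0.6, 1.0]

    result = EvaluationResult(
        avg_score=sum(scores) / len(scores),          # 0.8
        per_item_scores=scores,
        additional_info={"metric": "exact_match"},    # free-form extra details
    )

    print(result.avg_score)        # 0.8
    print(result.per_item_scores)  # [0.8, 0.6, 1.0]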
class BaseEvaluator(*args, **kwargs)[source]

Bases: object

compute_single_item(*args, **kwargs) → float[source]

Compute the score for a single item.

compute(*args, **kwargs) → Any[source]

Evaluate a list of predictions against ground-truth values, and return the overall score along with per-item scores.
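
A minimal sketch of a concrete subclass, assuming compute receives lists of predictions and ground-truth values and returns an EvaluationResult; the argument names and the import path are assumptions, since the base class only declares *args and **kwargs:

    from typing import List

    # Assumed import path based on this page's module name.
    from base import BaseEvaluator, EvaluationResult


    class ExactMatchEvaluator(BaseEvaluator):
        """Hypothetical evaluator: scores 1.0 when a prediction equals its ground truth."""

        def compute_single_item(self, prediction: str, ground_truth: str) -> float:
            # Score a single prediction / ground-truth pair.
            return 1.0 if prediction == ground_truth else 0.0

        def compute(self, predictions: List[str], ground_truths: List[str]) -> EvaluationResult:
            # Score each pair, then aggregate into an overall score.
            per_item = [
                self.compute_single_item(p, g)
                for p, g in zip(predictions, ground_truths)
            ]
            avg = sum(per_item) / len(per_item) if per_item else 0.0
            return EvaluationResult(avg_score=avg, per_item_scores=per_item)


    evaluator = ExactMatchEvaluator()
    result = evaluator.compute(["a", "b"], ["a", "c"])
    print(result.avg_score)  # 0.5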