base

Abstract base class for evaluation metrics.

Classes

BaseEvaluator(*args, **kwargs)
    Abstract base class for evaluation metrics.

EvaluationResult(avg_score[, ...])
    Evaluation result.

class EvaluationResult(avg_score: float, per_item_scores: List[float] | None = None, additional_info: dict | None = None)[source]

Bases: object

Evaluation result holding the average score, optional per-item scores, and optional additional information.

avg_score: float
    Average score across all evaluated items.

per_item_scores: List[float] | None = None
    Score for each evaluated item, if available.

additional_info: dict | None = None
    Free-form dictionary for extra result metadata.
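A minimal usage sketch, assuming EvaluationResult behaves as a plain dataclass per the signature above; the import path and the dictionary keys shown are illustrative, not part of the documented API:

    # Assumes EvaluationResult is imported from this module; the exact
    # import path depends on the package layout.
    result = EvaluationResult(
        avg_score=0.85,
        per_item_scores=[0.9, 0.8, 0.85],           # one score per evaluated item
        additional_info={"metric": "exact_match"},  # hypothetical metadata keys
    )

    print(result.avg_score)        # 0.85
    print(result.per_item_scores)  # [0.9, 0.8, 0.85]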
class BaseEvaluator(*args, **kwargs)[source]

Bases: object

Abstract base class for evaluation metrics.

compute_single_item(*args, **kwargs) → float[source]

Compute the score for a single item.

compute(*args, **kwargs) → Any[source]

Evaluate a list of predictions against their ground truth values, and return the overall score together with the per-item scores.
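A minimal subclass sketch, assuming BaseEvaluator and EvaluationResult are imported from this module; the ExactMatchEvaluator name and its (pred, gold) parameters are illustrative and not part of the documented API:

    from typing import Any, List

    # Hypothetical evaluator for illustration only.
    class ExactMatchEvaluator(BaseEvaluator):
        def compute_single_item(self, pred: str, gold: str) -> float:
            # Score one prediction: 1.0 on an exact match, else 0.0.
            return float(pred.strip() == gold.strip())

        def compute(self, preds: List[str], golds: List[str]) -> Any:
            # Score every (prediction, ground truth) pair, then aggregate.
            scores = [self.compute_single_item(p, g) for p, g in zip(preds, golds)]
            avg = sum(scores) / len(scores) if scores else 0.0
            return EvaluationResult(avg_score=avg, per_item_scores=scores)

Calling ExactMatchEvaluator().compute(["yes", "no"], ["yes", "yes"]) would then return an EvaluationResult with avg_score 0.5 and per_item_scores [1.0, 0.0].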