squad
Official evaluation script for v1.1 of the SQuAD dataset.
normalize_answer
def normalize_answer(s)
Lowercase the text and remove punctuation, articles, and extra whitespace.
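The normalization steps named in the description can be sketched as follows. This is a minimal illustration, not the module's source: the name `normalize_answer_sketch` is hypothetical, and the choices of ASCII punctuation (`string.punctuation`) and the English articles "a", "an", "the" are assumptions based on the usual SQuAD v1.1 convention.

```python
import re
import string

# Assumption: "punctuation" means the ASCII set in string.punctuation.
_PUNCTUATION = set(string.punctuation)


def normalize_answer_sketch(s):
    """Hypothetical stand-in for normalize_answer(s)."""
    # 1. Lowercase the text.
    text = s.lower()
    # 2. Remove punctuation characters.
    text = "".join(ch for ch in text if ch not in _PUNCTUATION)
    # 3. Replace the articles with a space (assumed: a, an, the).
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    # 4. Collapse runs of whitespace into single spaces.
    return " ".join(text.split())


print(normalize_answer_sketch("The  Eiffel Tower!"))  # eiffel tower
```

Removing punctuation before articles means a string such as "the." still loses its article once the period is gone.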
f1_score
def f1_score(prediction, ground_truth)
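The signature carries no description here. A plausible reading, assuming the function follows the usual SQuAD v1.1 convention, is a token-level F1 between the normalized prediction and the normalized ground truth. The sketch below uses hypothetical names (`_normalize`, `f1_score_sketch`) and is not the module's source.

```python
import re
import string
from collections import Counter

_PUNCTUATION = set(string.punctuation)


def _normalize(s):
    # Hypothetical stand-in for normalize_answer: lowercase, strip
    # punctuation, drop articles, collapse whitespace.
    s = "".join(ch for ch in s.lower() if ch not in _PUNCTUATION)
    s = re.sub(r"\b(a|an|the)\b", " ", s)
    return " ".join(s.split())


def f1_score_sketch(prediction, ground_truth):
    pred_tokens = _normalize(prediction).split()
    gold_tokens = _normalize(ground_truth).split()
    # Multiset intersection counts each shared token at most as often
    # as it appears on both sides.
    num_same = sum((Counter(pred_tokens) & Counter(gold_tokens)).values())
    if num_same == 0:
        return 0.0
    precision = num_same / len(pred_tokens)
    recall = num_same / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)


# "the cat sat" -> [cat, sat]; "cat sat down" -> [cat, sat, down]
# precision = 2/2, recall = 2/3, so F1 is about 0.8.
score = f1_score_sketch("the cat sat", "cat sat down")
```

Returning early when there is no overlap also covers the case where either side normalizes to an empty string, which would otherwise divide by zero.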
exact_match_score
def exact_match_score(prediction, ground_truth)
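Assuming exact match is decided after normalization, as in the usual SQuAD v1.1 convention, the comparison might look like the sketch below. The names `_normalize` and `exact_match_score_sketch` are hypothetical.

```python
import re
import string

_PUNCTUATION = set(string.punctuation)


def _normalize(s):
    # Hypothetical stand-in for normalize_answer.
    s = "".join(ch for ch in s.lower() if ch not in _PUNCTUATION)
    s = re.sub(r"\b(a|an|the)\b", " ", s)
    return " ".join(s.split())


def exact_match_score_sketch(prediction, ground_truth):
    # True only when both strings are identical after normalization.
    return _normalize(prediction) == _normalize(ground_truth)


print(exact_match_score_sketch("The Eiffel Tower.", "eiffel tower"))  # True
print(exact_match_score_sketch("Eiffel", "eiffel tower"))             # False
```

Because the result is a boolean, summing it over many questions treats each match as 1 and each miss as 0.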
metric_max_over_ground_truths
def metric_max_over_ground_truths(metric_fn, prediction, ground_truths)
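Judging by the name and parameters, this helper presumably scores one prediction against every reference answer and keeps the best score. A minimal sketch under that assumption follows. The names `metric_max_sketch` and `toy_metric` are hypothetical, and the toy metric only stands in for `exact_match_score` or `f1_score`.

```python
def metric_max_sketch(metric_fn, prediction, ground_truths):
    # Score the prediction against each reference and keep the maximum.
    # Assumes ground_truths is non-empty; max() raises ValueError otherwise.
    return max(metric_fn(prediction, truth) for truth in ground_truths)


def toy_metric(prediction, ground_truth):
    # Hypothetical metric: case-insensitive string equality as 0.0 or 1.0.
    return float(prediction.lower() == ground_truth.lower())


best = metric_max_sketch(toy_metric, "paris", ["Paris, France", "Paris"])
print(best)  # 1.0
```

Taking the maximum means a prediction is credited if it matches any one of the annotated answers, not all of them.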
evaluate
def evaluate(dataset, predictions)
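The sketch below shows how the pieces could fit together. It rests on several assumptions not stated on this page: that `dataset` is a list of articles in the SQuAD v1.1 JSON layout (`paragraphs` → `qas` → `id`, `answers[].text`), that `predictions` maps question ids to answer strings, that questions without a prediction score zero, and that the result reports percentages. All function names and the toy data are hypothetical.

```python
import re
import string
from collections import Counter

_PUNCTUATION = set(string.punctuation)


def _normalize(s):
    # Hypothetical stand-in for normalize_answer.
    s = "".join(ch for ch in s.lower() if ch not in _PUNCTUATION)
    s = re.sub(r"\b(a|an|the)\b", " ", s)
    return " ".join(s.split())


def _exact(prediction, ground_truth):
    return float(_normalize(prediction) == _normalize(ground_truth))


def _f1(prediction, ground_truth):
    pred, gold = _normalize(prediction).split(), _normalize(ground_truth).split()
    num_same = sum((Counter(pred) & Counter(gold)).values())
    if num_same == 0:
        return 0.0
    precision, recall = num_same / len(pred), num_same / len(gold)
    return 2 * precision * recall / (precision + recall)


def evaluate_sketch(dataset, predictions):
    exact_total = f1_total = 0.0
    count = 0
    for article in dataset:
        for paragraph in article["paragraphs"]:
            for qa in paragraph["qas"]:
                count += 1
                if qa["id"] not in predictions:
                    continue  # assumption: a missing prediction scores zero
                golds = [answer["text"] for answer in qa["answers"]]
                pred = predictions[qa["id"]]
                # Best score over all reference answers for this question.
                exact_total += max(_exact(pred, g) for g in golds)
                f1_total += max(_f1(pred, g) for g in golds)
    # Assumes at least one question; an empty dataset would divide by zero.
    return {"exact_match": 100.0 * exact_total / count,
            "f1": 100.0 * f1_total / count}


# Toy data in the assumed layout.
toy_dataset = [{"paragraphs": [{"qas": [
    {"id": "q1", "answers": [{"text": "Eiffel Tower"}]},
    {"id": "q2", "answers": [{"text": "1889"}, {"text": "the year 1889"}]},
]}]}]
toy_predictions = {"q1": "The Eiffel Tower", "q2": "in 1889"}
result = evaluate_sketch(toy_dataset, toy_predictions)
```

With this toy data, `q1` matches exactly while `q2` only overlaps partially, so the exact-match figure is 50.0 and the F1 figure is higher.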