rouge

[ allennlp.training.metrics.rouge ]

ROUGE Objects

class ROUGE(Metric):
 | def __init__(
 |     self,
 |     ngram_size: int = 2,
 |     exclude_indices: Set[int] = None
 | ) -> None

Recall-Oriented Understudy for Gisting Evaluation (ROUGE)

ROUGE is a metric for measuring the quality of summaries. It is based on recall: the fraction of ngrams in a set of reference summaries that also appear in the predicted summary. See [Lin, "ROUGE: A Package For Automatic Evaluation Of Summaries", 2004](https://api.semanticscholar.org/CorpusID:964287).
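
To make the recall idea concrete, here is a minimal sketch of ROUGE-N recall for one predicted summary against a single reference. It is plain Python for illustration and is not the class's actual implementation; the helper names `ngram_counts` and `rouge_n_recall` are hypothetical, and the sketch assumes whitespace-tokenized strings rather than the token index tensors a model would produce.

```python
from collections import Counter
from typing import Sequence


def ngram_counts(tokens: Sequence[str], n: int) -> Counter:
    """Count every contiguous n-gram in a token sequence."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n_recall(predicted: Sequence[str], reference: Sequence[str], n: int) -> float:
    """Fraction of reference n-grams that also occur in the prediction.

    Matches are clipped: an n-gram seen k times in the reference can be
    matched at most as many times as it occurs in the prediction.
    """
    ref = ngram_counts(reference, n)
    pred = ngram_counts(predicted, n)
    total = sum(ref.values())
    if total == 0:
        return 0.0  # reference shorter than n tokens
    overlap = sum(min(count, pred[gram]) for gram, count in ref.items())
    return overlap / total


predicted = "the cat sat on the mat".split()
reference = "the cat lay on the mat".split()
rouge_1 = rouge_n_recall(predicted, reference, 1)  # 5 of 6 reference unigrams matched
rouge_2 = rouge_n_recall(predicted, reference, 2)  # 3 of 5 reference bigrams matched
```

Because the denominator counts reference ngrams only, a longer prediction is never penalized by this score, which is why it is described as recall-oriented.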

Parameters

ngram_size : int, optional (default = 2)
ROUGE scores are calculated for ROUGE-1 .. ROUGE-ngram_size.
exclude_indices : Set[int], optional (default = None)
Indices to exclude when calculating ngrams. This should usually include the indices of the start, end, and pad tokens.
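
The effect of the two parameters can be sketched on sequences of token indices. The example below is an illustration, not the library's code: `index_ngrams` and `rouge_recalls` are hypothetical helpers, the index values chosen for the pad, start, and end tokens are made up, and it assumes that any ngram containing an excluded index is skipped entirely.

```python
from collections import Counter
from typing import Dict, Optional, Sequence, Set


def index_ngrams(indices: Sequence[int], n: int,
                 exclude_indices: Optional[Set[int]] = None) -> Counter:
    """Count n-grams of token indices, skipping any that touch an excluded index."""
    exclude = exclude_indices or set()
    counts: Counter = Counter()
    for start in range(len(indices) - n + 1):
        gram = tuple(indices[start:start + n])
        if any(index in exclude for index in gram):
            continue  # assumption: n-grams containing special tokens are ignored
        counts[gram] += 1
    return counts


def rouge_recalls(predicted: Sequence[int], reference: Sequence[int],
                  ngram_size: int = 2,
                  exclude_indices: Optional[Set[int]] = None) -> Dict[str, float]:
    """Recall for every n from 1 up to ngram_size, keyed as ROUGE-n."""
    scores = {}
    for n in range(1, ngram_size + 1):
        ref = index_ngrams(reference, n, exclude_indices)
        pred = index_ngrams(predicted, n, exclude_indices)
        total = sum(ref.values())
        overlap = sum(min(count, pred[gram]) for gram, count in ref.items())
        scores[f"ROUGE-{n}"] = overlap / total if total else 0.0
    return scores


PAD, START, END = 0, 1, 2  # hypothetical vocabulary indices
predicted = [START, 5, 6, 7, END, PAD, PAD]
reference = [START, 5, 6, 8, END, PAD, PAD]

with_exclusion = rouge_recalls(predicted, reference, ngram_size=2,
                               exclude_indices={PAD, START, END})
without_exclusion = rouge_recalls(predicted, reference, ngram_size=2)
```

Without exclusion, the start, end, and pad tokens match trivially in both sequences and inflate the scores, which is the reason for passing their indices in `exclude_indices`.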

reset

 | @overrides
 | def reset(self) -> None

get_metric

 | @overrides
 | def get_metric(self, reset: bool = False) -> Dict[str, float]

Parameters

reset : bool, optional (default = False)
Reset any accumulators or internal state.

Returns

Dict[str, float]:
A dictionary containing ROUGE-1 .. ROUGE-ngram_size scores.
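
The interplay between `get_metric` and `reset` follows the usual accumulate-then-report pattern. The sketch below shows that pattern with a hypothetical `RunningAverage` class; it is not the ROUGE class itself, and the `update` method and the `"score"` key are assumptions made for the example.

```python
from typing import Dict


class RunningAverage:
    """Toy metric that accumulates scores across batches."""

    def __init__(self) -> None:
        self._total = 0.0
        self._count = 0

    def update(self, score: float) -> None:
        """Add one observation to the accumulators."""
        self._total += score
        self._count += 1

    def reset(self) -> None:
        """Clear all accumulated state."""
        self._total = 0.0
        self._count = 0

    def get_metric(self, reset: bool = False) -> Dict[str, float]:
        """Report the mean so far, optionally clearing state afterwards."""
        mean = self._total / self._count if self._count else 0.0
        if reset:
            self.reset()
        return {"score": mean}


metric = RunningAverage()
metric.update(0.5)
metric.update(1.0)
mid_epoch = metric.get_metric()               # state is kept
end_of_epoch = metric.get_metric(reset=True)  # same value, then state is cleared
after_reset = metric.get_metric()
```

Calling `get_metric(reset=True)` at the end of an epoch returns the accumulated value and starts the next epoch from a clean state.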