fbeta_verbose_measure

allennlp.training.metrics.fbeta_verbose_measure

FBetaVerboseMeasure

@Metric.register("fbeta_verbose")
class FBetaVerboseMeasure(FBetaMeasure):
 | def __init__(
 |     self,
 |     beta: float = 1.0,
 |     labels: List[int] = None,
 |     index_to_label: Dict[int, str] = None
 | ) -> None

Compute precision, recall, F-measure and support for each class.

This is basically the same as FBetaMeasure (the super class) with two differences:
- It always returns a dictionary of floats, while FBetaMeasure can return a dictionary of lists (one element for each class).
- It always returns precision, recall and F-measure for each class and also three averaged values for each metric: micro, macro and weighted averages.

The returned dictionary contains keys with the following format:
- <class>-precision : float
- <class>-recall : float
- <class>-fscore : float
- <avg>-precision : float
- <avg>-recall : float
- <avg>-fscore : float

where <class> is the index (or the label, if index_to_label is given) of each class, and <avg> is one of micro, macro or weighted, one for each kind of average.

The precision is the ratio tp / (tp + fp) where tp is the number of true positives and fp the number of false positives. The precision is intuitively the ability of the classifier not to label as positive a sample that is negative.

The recall is the ratio tp / (tp + fn) where tp is the number of true positives and fn the number of false negatives. The recall is intuitively the ability of the classifier to find all the positive samples.
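As a minimal sketch of the two ratios defined above, the helper below computes precision and recall from raw counts. The function name `precision_recall` is hypothetical, and returning 0.0 when a denominator is zero is an assumption made for this illustration, not a statement about how the library handles that case.

```python
from typing import Tuple


def precision_recall(tp: int, fp: int, fn: int) -> Tuple[float, float]:
    """Return (precision, recall) computed from raw counts.

    Assumption for this sketch: a ratio with a zero denominator is reported as 0.0.
    """
    precision = tp / (tp + fp) if (tp + fp) > 0 else 0.0
    recall = tp / (tp + fn) if (tp + fn) > 0 else 0.0
    return precision, recall


# 8 true positives, 2 false positives, 8 false negatives
p, r = precision_recall(tp=8, fp=2, fn=8)
print(p, r)  # 0.8 0.5
```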

The F-beta score can be interpreted as a weighted harmonic mean of the precision and recall, where an F-beta score reaches its best value at 1 and worst score at 0.

If we have precision and recall, the F-beta score is simply:
F-beta = (1 + beta ** 2) * precision * recall / (beta ** 2 * precision + recall)

The beta parameter sets the weight of recall relative to precision: beta > 1 favors recall, beta < 1 favors precision, and beta == 1.0 means recall and precision are equally important.
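The effect of beta can be checked directly with the F-beta formula given above. The sketch below is a plain-Python illustration, not the library's code; the function name `fbeta` and the choice to return 0.0 when the denominator is zero are assumptions.

```python
def fbeta(precision: float, recall: float, beta: float = 1.0) -> float:
    """F-beta = (1 + beta^2) * P * R / (beta^2 * P + R).

    Assumption for this sketch: the score is 0.0 when the denominator is zero.
    """
    beta2 = beta ** 2
    denom = beta2 * precision + recall
    if denom == 0.0:
        return 0.0
    return (1 + beta2) * precision * recall / denom


# A classifier with low precision but perfect recall.
p, r = 0.5, 1.0
print(round(fbeta(p, r, beta=0.5), 3))  # 0.556: precision counts more
print(round(fbeta(p, r, beta=1.0), 3))  # 0.667: balanced harmonic mean
print(round(fbeta(p, r, beta=2.0), 3))  # 0.833: recall counts more
```

Because recall is the stronger of the two values here, the score rises as beta grows.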

The support is the number of occurrences of each class in the gold labels (y_true).
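To make the three kinds of average concrete, here is a hedged sketch that applies the standard definitions to precision: micro pools the counts over all classes, macro takes the unweighted mean of the per-class values, and weighted takes the mean weighted by support. The function name `averaged_precision` and the per-class count lists are hypothetical; this is not the library's implementation.

```python
from typing import Dict, List


def averaged_precision(tp: List[int], fp: List[int], fn: List[int]) -> Dict[str, float]:
    """Micro, macro and support-weighted precision from per-class counts.

    Assumption for this sketch: any ratio with a zero denominator is 0.0.
    """
    per_class = [t / (t + f) if (t + f) > 0 else 0.0 for t, f in zip(tp, fp)]
    support = [t + n for t, n in zip(tp, fn)]  # gold occurrences per class

    total_predicted = sum(tp) + sum(fp)
    total_support = sum(support)

    micro = sum(tp) / total_predicted if total_predicted > 0 else 0.0
    macro = sum(per_class) / len(per_class) if per_class else 0.0
    weighted = (
        sum(p * s for p, s in zip(per_class, support)) / total_support
        if total_support > 0
        else 0.0
    )
    return {"micro": micro, "macro": macro, "weighted": weighted}


# Two classes: class 0 has support 9, class 1 has support 3.
averages = averaged_precision(tp=[8, 1], fp=[2, 1], fn=[1, 2])
print(averages)  # micro is 0.75, macro about 0.65, weighted about 0.725
```

The weighted value sits closer to the precision of class 0 because that class has three times the support.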

Parameters

beta : float, optional (default = 1.0)
The strength of recall versus precision in the F-score.
labels : List[int], optional (default = None)
The set of labels to include. Labels present in the data can be excluded, for example, to calculate a multi-class average ignoring a majority negative class. Labels not present in the data will result in 0 components in a macro or weighted average.
index_to_label : Dict[int, str], optional (default = None)
A dictionary mapping indices to the corresponding label. If this map is given, the provided metrics include the label instead of the index for each class.

get_metric

class FBetaVerboseMeasure(FBetaMeasure):
 | ...
 | def get_metric(self, reset: bool = False)

Returns

- <class>-precision : float
- <class>-recall : float
- <class>-fscore : float
- <avg>-precision : float
- <avg>-recall : float
- <avg>-fscore : float

where <class> is the index (or the label, if index_to_label is given) of each class, and <avg> is one of micro, macro or weighted, one for each kind of average.
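The key naming scheme described in this section can be illustrated with a small stand-alone sketch. The helper `metric_keys` is hypothetical and only builds the key names; it does not call the library, and the order of the keys is an assumption made for the example.

```python
from typing import Dict, List, Optional


def metric_keys(
    class_indices: List[int],
    index_to_label: Optional[Dict[int, str]] = None,
) -> List[str]:
    """Build the key names: one prefix per class, then one per kind of average."""
    names = ["precision", "recall", "fscore"]
    # Use the label when a mapping is supplied, otherwise the index itself.
    prefixes = [
        index_to_label[i] if index_to_label else str(i) for i in class_indices
    ]
    prefixes += ["micro", "macro", "weighted"]
    return [f"{prefix}-{name}" for prefix in prefixes for name in names]


print(metric_keys([0, 1])[:3])
# ['0-precision', '0-recall', '0-fscore']
print(metric_keys([0, 1], {0: "neg", 1: "pos"})[3:6])
# ['pos-precision', 'pos-recall', 'pos-fscore']
```

With two classes, the sketch yields 15 keys: three metrics for each class and three for each of the micro, macro and weighted averages.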