# fbeta_multi_label_measure

[ allennlp.training.metrics.fbeta_multi_label_measure ]

## FBetaMultiLabelMeasure

```python
class FBetaMultiLabelMeasure(FBetaMeasure):
    def __init__(
        self,
        beta: float = 1.0,
        average: str = None,
        labels: List[int] = None,
        threshold: float = 0.5
    ) -> None
```

Compute precision, recall, F-measure and support for multi-label classification.

The precision is the ratio `tp / (tp + fp)` where `tp` is the number of true positives and `fp` the number of false positives. The precision is intuitively the ability of the classifier not to label as positive a sample that is negative.

The recall is the ratio `tp / (tp + fn)` where `tp` is the number of true positives and `fn` the number of false negatives. The recall is intuitively the ability of the classifier to find all the positive samples.

The F-beta score can be interpreted as a weighted harmonic mean of the precision and recall, where an F-beta score reaches its best value at 1 and worst score at 0.

If we have precision and recall, the F-beta score is simply: `F-beta = (1 + beta ** 2) * precision * recall / (beta ** 2 * precision + recall)`

The F-beta score weights recall more than precision by a factor of `beta`: `beta > 1.0` favors recall, `beta < 1.0` favors precision, and `beta == 1.0` means recall and precision are equally important.
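The formula above can be checked with a few lines of plain Python. This is a minimal sketch, not the library's implementation: the function name `fbeta_score` is hypothetical, and returning `0.0` when a denominator is zero is a convention assumed here.

```python
def fbeta_score(tp: int, fp: int, fn: int, beta: float = 1.0) -> float:
    """Hypothetical helper: F-beta computed from raw counts."""
    # Assumed convention: a zero denominator yields a score of 0.0.
    precision = tp / (tp + fp) if (tp + fp) > 0 else 0.0
    recall = tp / (tp + fn) if (tp + fn) > 0 else 0.0
    denominator = beta ** 2 * precision + recall
    if denominator == 0.0:
        return 0.0
    return (1 + beta ** 2) * precision * recall / denominator


# With tp=6, fp=2, fn=4 the precision is 0.75 and the recall is 0.6.
f1 = fbeta_score(6, 2, 4, beta=1.0)       # balanced
f2 = fbeta_score(6, 2, 4, beta=2.0)       # leans toward recall
f_half = fbeta_score(6, 2, 4, beta=0.5)   # leans toward precision
```

Because recall is lower than precision in this example, raising `beta` pulls the score down toward the recall, and lowering it pulls the score up toward the precision.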

The support is the number of occurrences of each class in the gold labels (`gold_labels`).

Parameters

• beta : `float`, optional (default = `1.0`)
The strength of recall versus precision in the F-score.

• average : `str`, optional (default = `None`)
If `None`, the scores for each class are returned. Otherwise, this determines the type of averaging performed on the data:

`'micro'`: Calculate metrics globally by counting the total true positives, false negatives and false positives.

`'macro'`: Calculate metrics for each label, and find their unweighted mean. This does not take label imbalance into account.

`'weighted'`: Calculate metrics for each label, and find their average weighted by support (the number of true instances for each label). This alters `'macro'` to account for label imbalance; it can result in an F-score that is not between precision and recall.

• labels : `list`, optional
The set of labels to include and their order if `average is None`. Labels present in the data can be excluded, for example to calculate a multi-class average ignoring a majority negative class. Labels not present in the data will result in 0 components in a macro or weighted average.

• threshold : `float`, optional (default = `0.5`)
Logits over this threshold will be considered predictions for the corresponding class.
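To make the three `average` modes concrete, the following sketch computes them from per-class counts of true positives, false positives and false negatives. It illustrates the definitions given above and is not the library's code. The names `safe_div`, `prf` and `averaged_prf` are hypothetical, and treating a zero denominator as `0.0` is an assumption.

```python
from typing import List, Tuple

Scores = Tuple[float, float, float]  # (precision, recall, fscore)


def safe_div(num: float, den: float) -> float:
    # Assumed convention: a zero denominator yields 0.0.
    return num / den if den > 0 else 0.0


def prf(tp: int, fp: int, fn: int, beta: float = 1.0) -> Scores:
    """Precision, recall and F-beta for one set of counts."""
    precision = safe_div(tp, tp + fp)
    recall = safe_div(tp, tp + fn)
    fscore = safe_div((1 + beta ** 2) * precision * recall,
                      beta ** 2 * precision + recall)
    return precision, recall, fscore


def averaged_prf(tps: List[int], fps: List[int], fns: List[int],
                 average: str, beta: float = 1.0) -> Scores:
    """Hypothetical helper combining per-class counts into one score triple."""
    if average == "micro":
        # Pool the counts over all classes, then score once.
        return prf(sum(tps), sum(fps), sum(fns), beta)
    per_class = [prf(tp, fp, fn, beta) for tp, fp, fn in zip(tps, fps, fns)]
    if average == "macro":
        weights = [1.0] * len(per_class)
    elif average == "weighted":
        # Support of a class = number of true instances = tp + fn.
        weights = [float(tp + fn) for tp, fn in zip(tps, fns)]
    else:
        raise ValueError(f"unknown average: {average!r}")
    total = sum(weights)
    p, r, f = (safe_div(sum(w * s[i] for w, s in zip(weights, per_class)), total)
               for i in range(3))
    return p, r, f


# Two classes: a frequent one scored well and a rare one scored poorly.
tps, fps, fns = [8, 1], [2, 1], [2, 3]
micro = averaged_prf(tps, fps, fns, "micro")
macro = averaged_prf(tps, fps, fns, "macro")
weighted = averaged_prf(tps, fps, fns, "weighted")
```

In this example the rare class drags the macro average down more than the weighted one, because macro averaging gives every class the same weight regardless of its support.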

### __call__

```python
class FBetaMultiLabelMeasure(FBetaMeasure):
    ...
    @overrides
    def __call__(
        self,
        predictions: torch.Tensor,
        gold_labels: torch.Tensor,
        mask: Optional[torch.BoolTensor] = None,
    )
```

Parameters

• predictions : `torch.Tensor`
A tensor of predictions of shape (batch_size, ..., num_classes).
• gold_labels : `torch.Tensor`
A tensor of multi-hot (0/1) gold labels of shape (batch_size, ..., num_classes). Because each instance may carry several labels, it must have the same shape as the `predictions` tensor.
• mask : `torch.BoolTensor`, optional (default = `None`)
A masking tensor the same size as `gold_labels`.
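The way thresholded predictions, gold labels and the mask combine into per-class counts can be sketched without any tensor library. This is an illustration under stated assumptions, not the library's code: it assumes gold labels are multi-hot rows with one 0/1 entry per class, as a multi-label setup implies, and that a score strictly above `threshold` counts as a prediction. How a score exactly equal to the threshold is handled is an assumption here. The function name `multilabel_counts` is hypothetical.

```python
from typing import List, Optional, Tuple


def multilabel_counts(
    scores: List[List[float]],
    gold: List[List[int]],
    threshold: float = 0.5,
    mask: Optional[List[List[bool]]] = None,
) -> Tuple[List[int], List[int], List[int]]:
    """Hypothetical helper: per-class (tp, fp, fn) counts for multi-label data."""
    num_classes = len(scores[0])
    tp = [0] * num_classes
    fp = [0] * num_classes
    fn = [0] * num_classes
    for i, (score_row, gold_row) in enumerate(zip(scores, gold)):
        for c in range(num_classes):
            # Entries with a False mask value are skipped entirely.
            if mask is not None and not mask[i][c]:
                continue
            predicted = score_row[c] > threshold  # assumed: strictly above
            actual = gold_row[c] == 1
            if predicted and actual:
                tp[c] += 1
            elif predicted and not actual:
                fp[c] += 1
            elif actual and not predicted:
                fn[c] += 1
    return tp, fp, fn


scores = [[0.9, 0.2, 0.7], [0.1, 0.8, 0.6], [0.4, 0.3, 0.95]]
gold = [[1, 0, 0], [0, 1, 1], [1, 0, 1]]

tp, fp, fn = multilabel_counts(scores, gold)
support = [t + f for t, f in zip(tp, fn)]  # true instances per class

# Masking out the first instance removes its contribution to every class.
row_mask = [[False] * 3, [True] * 3, [True] * 3]
tp_m, fp_m, fn_m = multilabel_counts(scores, gold, mask=row_mask)
```

Each class is scored independently, so a single instance can contribute a true positive to one class and a false positive to another.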

## F1MultiLabelMeasure

```python
class F1MultiLabelMeasure(FBetaMultiLabelMeasure):
    def __init__(
        self,
        average: str = None,
        labels: List[int] = None,
        threshold: float = 0.5
    ) -> None
```