class VqaMeasure(Metric):
 | def __init__(self) -> None

Compute the VQA accuracy metric used for evaluation in the VQA challenge.

In VQA, we take the answer with the highest score, and then we find out how often human annotators decided this was the right answer. The accuracy score for an answer is min(1.0, human_count / 3), so an answer counts as fully correct once at least three annotators gave it.

This metric takes the logits from the model, i.e., a score for each possible answer, together with the labels for the question and their weights.
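As a concrete illustration of the min(1.0, human_count / 3) rule above, here is a small sketch of turning raw human answer counts into label weights. The counts are made-up numbers for illustration only:

```python
import torch

# Hypothetical human answer counts for two questions with two candidate
# labels each: each entry is how many annotators gave that answer.
human_counts = torch.tensor([[10.0, 2.0],
                             [3.0, 1.0]])  # (batch_size, num_labels)

# An answer is fully correct once at least 3 humans gave it: min(1.0, count / 3).
label_weights = torch.clamp(human_counts / 3.0, max=1.0)
```

Here the first answer of each question is capped at 1.0, while the answers given by fewer than three annotators receive partial credit.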


class VqaMeasure(Metric):
 | ...
 | def __call__(
 |     self,
 |     logits: torch.Tensor,
 |     labels: torch.Tensor,
 |     label_weights: torch.Tensor
 | )


  • logits : torch.Tensor
    A tensor of predictions of shape (batch_size, num_classes).
  • labels : torch.Tensor
    A tensor of integer class labels of shape (batch_size, num_labels).
  • label_weights : torch.Tensor
    A tensor of floats of shape (batch_size, num_labels), giving a weight or score to every one of the labels.
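Given tensors of these shapes, the per-instance score computation can be sketched as follows. `vqa_batch_scores` is a hypothetical helper, not part of the library, and it assumes `label_weights` already encodes min(1.0, human_count / 3):

```python
import torch

def vqa_batch_scores(logits: torch.Tensor,
                     labels: torch.Tensor,
                     label_weights: torch.Tensor) -> torch.Tensor:
    # Take the answer with the highest model score for each instance.
    predictions = logits.argmax(dim=-1, keepdim=True)  # (batch_size, 1)
    # Mark which of the instance's labels (if any) match the prediction.
    matches = (labels == predictions).float()          # (batch_size, num_labels)
    # The score is the weight of the matched label, or 0 if nothing matched.
    return (matches * label_weights).sum(dim=-1)       # (batch_size,)

# Made-up batch of two questions with three candidate answers each.
logits = torch.tensor([[0.1, 2.0, 0.3],
                       [3.0, 0.1, 0.2]])
labels = torch.tensor([[1, 2],
                       [2, 0]])
label_weights = torch.tensor([[1.0, 0.3],
                              [0.6, 0.3]])
scores = vqa_batch_scores(logits, labels, label_weights)
```

In this sketch, the first instance predicts class 1, which matches a label with weight 1.0; the second predicts class 0, which matches a label with weight 0.3.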


class VqaMeasure(Metric):
 | ...
 | def get_metric(self, reset: bool = False)


class VqaMeasure(Metric):
 | ...
 | def reset(self) -> None
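Taken together, the three methods follow the usual accumulate / report / reset life cycle of a metric. A minimal standalone sketch of that life cycle (not the library's implementation; the class name and internal details are assumptions) might look like:

```python
import torch

class VqaScoreSketch:
    """Hypothetical accumulator mirroring the __call__ / get_metric / reset API."""

    def __init__(self) -> None:
        self._sum_of_scores = 0.0
        self._count = 0

    def __call__(self, logits, labels, label_weights) -> None:
        # Score each instance: the weight of the label matching the
        # argmax prediction, or 0 if no label matches.
        matches = (labels == logits.argmax(dim=-1, keepdim=True)).float()
        scores = (matches * label_weights).sum(dim=-1)
        self._sum_of_scores += scores.sum().item()
        self._count += scores.numel()

    def get_metric(self, reset: bool = False) -> float:
        # Average score over every instance seen since the last reset.
        score = self._sum_of_scores / self._count if self._count else 0.0
        if reset:
            self.reset()
        return score

    def reset(self) -> None:
        self._sum_of_scores = 0.0
        self._count = 0
```

Calling `get_metric(reset=True)` at the end of an epoch reports the running average and clears the accumulated counts for the next one.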