evaluation
allennlp.tango.evaluation
AllenNLP Tango is an experimental API and parts of it might change or disappear every time we release a new version.
EvaluationStep
@Step.register("evaluation")
class EvaluationStep(Step)
This step evaluates a given model on a given dataset.
DETERMINISTIC
class EvaluationStep(Step):
| ...
| DETERMINISTIC = True
VERSION
class EvaluationStep(Step):
| ...
| VERSION = "002"
FORMAT
class EvaluationStep(Step):
| ...
| FORMAT: Format = JsonFormat(compress="gz")
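The step's results are stored as gzip-compressed JSON. As a rough illustration of what that format amounts to, here is a round-trip with the standard library only; this is a sketch of the on-disk representation, not the actual `JsonFormat` class:

```python
import gzip
import json

# Example payload shaped like an evaluation result (illustrative values).
data = {"metrics": {"accuracy": 0.91}}

# Write gzip-compressed JSON, conceptually what JsonFormat(compress="gz") produces.
with gzip.open("result.json.gz", "wt", encoding="utf-8") as f:
    json.dump(data, f)

# Read it back and confirm the round-trip is lossless.
with gzip.open("result.json.gz", "rt", encoding="utf-8") as f:
    loaded = json.load(f)

print(loaded == data)
```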
EvaluationResult
@dataclasses.dataclass
class EvaluationResult
metrics
class EvaluationResult:
| ...
| metrics: Dict[str, Any] = None
predictions
class EvaluationResult:
| ...
| predictions: List[Dict[str, Any]] = None
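To make the shape of the result concrete, the two fields can be mirrored with a plain dataclass. This is an illustrative stand-in with example values, not the library's own class:

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Optional

@dataclass
class EvaluationResult:
    # Field names mirror the documentation above; values here are examples.
    metrics: Optional[Dict[str, Any]] = None            # e.g. {"accuracy": 0.91}
    predictions: Optional[List[Dict[str, Any]]] = None  # one dict per instance

result = EvaluationResult(
    metrics={"accuracy": 0.91},
    predictions=[{"label": "positive"}, {"label": "negative"}],
)
print(result.metrics["accuracy"])  # 0.91
```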
run
class EvaluationStep(Step):
| ...
| def run(
| self,
| model: Model,
| dataset: DatasetDict,
| split: str = "validation",
| data_loader: Optional[Lazy[TangoDataLoader]] = None
| ) -> EvaluationResult
Runs an evaluation on a dataset.
model
is the model we want to evaluate.

dataset
is the dataset we want to evaluate on.

split
is the name of the split we want to evaluate on.

data_loader
gives you the chance to choose a custom data loader for the evaluation. By default, this step evaluates on batches of 32 instances each.
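The default data loader groups the chosen split into batches of 32 instances. A minimal sketch of that batching behavior (not the actual `TangoDataLoader` implementation):

```python
from itertools import islice
from typing import Any, Iterable, Iterator, List

BATCH_SIZE = 32  # documented default batch size for this step

def batched(instances: Iterable[Any], batch_size: int = BATCH_SIZE) -> Iterator[List[Any]]:
    """Yield fixed-size batches; the final batch may be smaller."""
    it = iter(instances)
    while batch := list(islice(it, batch_size)):
        yield batch

# A toy "split" of 70 instances is evaluated in batches of 32, 32, and 6.
sizes = [len(b) for b in batched(range(70))]
print(sizes)  # [32, 32, 6]
```

Passing a custom `data_loader` replaces this default batching entirely, which is useful when instances vary widely in length or memory cost.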