allennlp.predictors¶
A Predictor is a wrapper for an AllenNLP Model that makes JSON predictions using JSON inputs. If you want to serve up a model through the web service (or using allennlp.commands.predict), you’ll need a Predictor that wraps it.
-
class
allennlp.predictors.predictor.
Predictor
(model: allennlp.models.model.Model, dataset_reader: allennlp.data.dataset_readers.dataset_reader.DatasetReader)[source]¶ Bases:
allennlp.common.registrable.Registrable
A Predictor is a thin wrapper around an AllenNLP model that handles JSON -> JSON predictions, which can be used for serving models through the web API or making predictions in bulk.
-
capture_model_internals
(self) → Iterator[dict][source]¶ Context manager that captures the internal-module outputs of this predictor’s model. The idea is that you could use it as follows:
    with predictor.capture_model_internals() as internals:
        outputs = predictor.predict_json(inputs)

    return {**outputs, "model_internals": internals}
-
dump_line
(self, outputs: Dict[str, Any]) → str[source]¶ If you don’t want your outputs in JSON-lines format you can override this function to output them differently.
-
classmethod
from_archive
(archive: allennlp.models.archival.Archive, predictor_name: str = None, dataset_reader_to_load: str = 'validation') → 'Predictor'[source]¶ Instantiate a Predictor from an Archive; that is, from the result of training a model. Optionally specify which Predictor subclass to use; otherwise, the default one for the model will be used. Optionally specify which DatasetReader should be loaded; otherwise, the validation one will be used if it exists, followed by the training dataset reader.
-
classmethod
from_path
(archive_path: str, predictor_name: str = None, cuda_device: int = -1, dataset_reader_to_load: str = 'validation') → 'Predictor'[source]¶ Instantiate a Predictor from an archive path. If you need more detailed configuration options, such as overrides, please use from_archive.
- Parameters
- archive_path: ``str``
The path to the archive.
- predictor_name: ``str``, optional (default=None)
Name that the predictor is registered as, or None to use the predictor associated with the model.
- cuda_device: ``int``, optional (default=-1)
If cuda_device is >= 0, the model will be loaded onto the corresponding GPU. Otherwise it will be loaded onto the CPU.
- dataset_reader_to_load: ``str``, optional (default=”validation”)
Which dataset reader to load from the archive, either “train” or “validation”.
- Returns
- A Predictor instance.
-
get_gradients
(self, instances: List[allennlp.data.instance.Instance]) → Tuple[Dict[str, Any], Dict[str, Any]][source]¶ Gets the gradients of the loss with respect to the model inputs.
- Parameters
- instances: List[Instance]
- Returns
- Tuple[Dict[str, Any], Dict[str, Any]]
The first item is a Dict of gradient entries for each input. The keys have the form {grad_input_1: ..., grad_input_2: ...}, up to the number of inputs given. The second item is the model’s output.
Notes
Takes a JsonDict representing the inputs of the model and converts them to Instances, then sends these through the model’s forward function after registering hooks on the embedding layer of the model. Calls backward() on the loss and then removes the hooks.
-
json_to_labeled_instances
(self, inputs: Dict[str, Any]) → List[allennlp.data.instance.Instance][source]¶ Converts incoming JSON to an Instance, runs the model on the newly created instance, and adds labels to the Instance.
-
load_line
(self, line: str) → Dict[str, Any][source]¶ If your inputs are not in JSON-lines format (e.g. you have a CSV) you can override this function to parse them correctly.
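A minimal sketch of what such overrides might look like, assuming two-column CSV input and tab-separated output. The class name `CsvLineFormat` and the field names `premise`, `hypothesis`, `label`, and `label_probs` are chosen for illustration only; in practice the two methods would be defined on a Predictor subclass rather than on a standalone class.

```python
import csv
import io
import json
from typing import Any, Dict


class CsvLineFormat:
    """Illustrative stand-in; these methods would normally live on a Predictor subclass."""

    def load_line(self, line: str) -> Dict[str, Any]:
        # Parse one CSV row into the JSON-style dict the predictor expects.
        # The two field names are assumptions made for this example.
        premise, hypothesis = next(csv.reader(io.StringIO(line)))
        return {"premise": premise, "hypothesis": hypothesis}

    def dump_line(self, outputs: Dict[str, Any]) -> str:
        # Emit a tab-separated line instead of a JSON line.
        return f"{outputs['label']}\t{json.dumps(outputs['label_probs'])}\n"


fmt = CsvLineFormat()
parsed = fmt.load_line('"A man, a plan",A canal\n')
dumped = fmt.dump_line({"label": "neutral", "label_probs": [0.1, 0.2, 0.7]})
```

Using the `csv` module rather than `str.split(",")` keeps quoted fields that contain commas intact.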
-
predict_batch_instance
(self, instances: List[allennlp.data.instance.Instance]) → List[Dict[str, Any]][source]¶
-
predictions_to_labeled_instances
(self, instance: allennlp.data.instance.Instance, outputs: Dict[str, numpy.ndarray]) → List[allennlp.data.instance.Instance][source]¶ This function takes a model’s outputs for an Instance, and it labels that instance according to the output. For example, in classification this function labels the instance according to the class with the highest probability. This function is used to compute gradients of what the model predicted. The return type is a list because in some tasks there are multiple predictions in the output (e.g., in NER a model predicts multiple spans). In this case, each instance in the returned list of Instances contains an individual entity prediction as the label.
-
-
class
allennlp.predictors.bidaf.
BidafPredictor
(model: allennlp.models.model.Model, dataset_reader: allennlp.data.dataset_readers.dataset_reader.DatasetReader)[source]¶ Bases:
allennlp.predictors.predictor.Predictor
Predictor for the BidirectionalAttentionFlow model.
-
predict
(self, question: str, passage: str) → Dict[str, Any][source]¶ Make a machine comprehension prediction on the supplied input. See https://rajpurkar.github.io/SQuAD-explorer/ for more information about the machine comprehension task.
- Parameters
- question: ``str``
A question about the content in the supplied paragraph. The question must be answerable by a span in the paragraph.
- passage: ``str``
A paragraph of information relevant to the question.
- Returns
- A dictionary that represents the prediction made by the system. The answer string will be under the “best_span_str” key.
-
predictions_to_labeled_instances
(self, instance: allennlp.data.instance.Instance, outputs: Dict[str, numpy.ndarray]) → List[allennlp.data.instance.Instance][source]¶ This function takes a model’s outputs for an Instance, and it labels that instance according to the output. For example, in classification this function labels the instance according to the class with the highest probability. This function is used to compute gradients of what the model predicted. The return type is a list because in some tasks there are multiple predictions in the output (e.g., in NER a model predicts multiple spans). In this case, each instance in the returned list of Instances contains an individual entity prediction as the label.
-
-
class
allennlp.predictors.decomposable_attention.
DecomposableAttentionPredictor
(model: allennlp.models.model.Model, dataset_reader: allennlp.data.dataset_readers.dataset_reader.DatasetReader)[source]¶ Bases:
allennlp.predictors.predictor.Predictor
Predictor for the DecomposableAttention model.
-
predict
(self, premise: str, hypothesis: str) → Dict[str, Any][source]¶ Predicts whether the hypothesis is entailed by the premise text.
- Parameters
- premise: ``str``
A passage representing what is assumed to be true.
- hypothesis: ``str``
A sentence that may be entailed by the premise.
- Returns
- A dictionary where the key “label_probs” holds the probabilities of each of [entailment, contradiction, neutral].
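As a small illustration of reading that return value, the sketch below picks the most probable label from a prediction dictionary. It assumes the probabilities are ordered [entailment, contradiction, neutral] as stated above; the helper name `most_likely_label` and the example numbers are made up for this example.

```python
from typing import Any, Dict, List

# Label order as documented for the "label_probs" key.
LABELS = ["entailment", "contradiction", "neutral"]


def most_likely_label(prediction: Dict[str, Any]) -> str:
    """Return the label whose probability is highest (hypothetical helper)."""
    probs: List[float] = prediction["label_probs"]
    best_index = max(range(len(probs)), key=probs.__getitem__)
    return LABELS[best_index]


# Made-up prediction, shaped like the dictionary described above.
example_prediction = {"label_probs": [0.08, 0.85, 0.07]}
label = most_likely_label(example_prediction)
```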
-
predictions_to_labeled_instances
(self, instance: allennlp.data.instance.Instance, outputs: Dict[str, numpy.ndarray]) → List[allennlp.data.instance.Instance][source]¶ This function takes a model’s outputs for an Instance, and it labels that instance according to the output. For example, in classification this function labels the instance according to the class with the highest probability. This function is used to compute gradients of what the model predicted. The return type is a list because in some tasks there are multiple predictions in the output (e.g., in NER a model predicts multiple spans). In this case, each instance in the returned list of Instances contains an individual entity prediction as the label.
-
-
class
allennlp.predictors.dialog_qa.
DialogQAPredictor
(model: allennlp.models.model.Model, dataset_reader: allennlp.data.dataset_readers.dataset_reader.DatasetReader, language: str = 'en_core_web_sm')[source]¶ Bases:
allennlp.predictors.predictor.Predictor
-
predict
(self, jsonline: str) → Dict[str, Any][source]¶ Make a dialog-style question answering prediction on the supplied input. The supplied input json must contain a list of question answer pairs, containing question, answer, yesno, followup, id as well as the context (passage).
- Parameters
- jsonline: ``str``
A json line that has the same format as the quac data file.
- Returns
- A dictionary that represents the prediction made by the system. The answer string will be under the “best_span_str” key.
-
-
class
allennlp.predictors.semantic_role_labeler.
SemanticRoleLabelerPredictor
(model: allennlp.models.model.Model, dataset_reader: allennlp.data.dataset_readers.dataset_reader.DatasetReader, language: str = 'en_core_web_sm')[source]¶ Bases:
allennlp.predictors.predictor.Predictor
Predictor for the SemanticRoleLabeler model.
-
predict
(self, sentence: str) → Dict[str, Any][source]¶ Predicts the semantic roles of the supplied sentence and returns a dictionary with the results.
    {"words": [...],
     "verbs": [
        {"verb": "...", "description": "...", "tags": [...]},
        ...
        {"verb": "...", "description": "...", "tags": [...]},
    ]}
- Parameters
- sentence, ``str``
The sentence to parse via semantic role labeling.
- Returns
- A dictionary representation of the semantic roles in the sentence.
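To make the relationship between the "words" and "tags" entries concrete, here is a rough sketch that renders one verb's tags as a bracketed string. It assumes BIO-style tags (`B-ARG0`, `I-ARG0`, `O`) aligned one-to-one with the words; the function name `describe` and the exact bracket format are illustrative and not necessarily what the library stores under "description".

```python
from typing import List, Optional


def describe(words: List[str], tags: List[str]) -> str:
    """Render BIO tags as bracketed spans (hypothetical helper)."""
    chunks: List[str] = []
    current: List[str] = []          # words of the span being built
    label: Optional[str] = None      # label of the span being built
    for word, tag in zip(words, tags):
        if tag.startswith("I-") and label == tag[2:]:
            current.append(word)     # continue the open span
            continue
        if label is not None:        # close the open span
            chunks.append(f"[{label}: {' '.join(current)}]")
            label, current = None, []
        if tag.startswith("B-"):     # open a new span
            label, current = tag[2:], [word]
        else:                        # token outside any span
            chunks.append(word)
    if label is not None:
        chunks.append(f"[{label}: {' '.join(current)}]")
    return " ".join(chunks)


description = describe(
    ["Mary", "went", "to", "Seattle", "."],
    ["B-ARG0", "B-V", "B-ARG4", "I-ARG4", "O"],
)
```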
-
predict_batch_json
(self, inputs: List[Dict[str, Any]]) → List[Dict[str, Any]][source]¶ Expects JSON that looks like
[{"sentence": "..."}, {"sentence": "..."}, ...]
and returns JSON that looks like
    [
        {"words": [...],
         "verbs": [
            {"verb": "...", "description": "...", "tags": [...]},
            ...
            {"verb": "...", "description": "...", "tags": [...]},
        ]},
        {"words": [...],
         "verbs": [
            {"verb": "...", "description": "...", "tags": [...]},
            ...
            {"verb": "...", "description": "...", "tags": [...]},
        ]}
    ]
-
predict_json
(self, inputs: Dict[str, Any]) → Dict[str, Any][source]¶ Expects JSON that looks like
{"sentence": "..."}
and returns JSON that looks like
    {"words": [...],
     "verbs": [
        {"verb": "...", "description": "...", "tags": [...]},
        ...
        {"verb": "...", "description": "...", "tags": [...]},
    ]}
-
predict_tokenized
(self, tokenized_sentence: List[str]) → Dict[str, Any][source]¶ Predicts the semantic roles of the supplied sentence tokens and returns a dictionary with the results.
- Parameters
- tokenized_sentence, ``List[str]``
The sentence tokens to parse via semantic role labeling.
- Returns
- A dictionary representation of the semantic roles in the sentence.
-
-
class
allennlp.predictors.sentence_tagger.
SentenceTaggerPredictor
(model: allennlp.models.model.Model, dataset_reader: allennlp.data.dataset_readers.dataset_reader.DatasetReader, language: str = 'en_core_web_sm')[source]¶ Bases:
allennlp.predictors.predictor.Predictor
Predictor for any model that takes in a sentence and returns a single set of tags for it. In particular, it can be used with the CrfTagger model and also the SimpleTagger model.
-
predictions_to_labeled_instances
(self, instance: allennlp.data.instance.Instance, outputs: Dict[str, numpy.ndarray]) → List[allennlp.data.instance.Instance][source]¶ This function currently only handles BIOUL tags.
Imagine an NER model predicts three named entities (each one with potentially multiple tokens). For each individual entity, we create a new Instance that has the label set to only that entity and the rest of the tokens are labeled as outside. We then return a list of those Instances.
For example:

    Mary   went  to  Seattle  to  visit  Microsoft  Research
    U-Per  O     O   U-Loc    O   O      B-Org      L-Org

We create three instances:

    Mary   went  to  Seattle  to  visit  Microsoft  Research
    U-Per  O     O   O        O   O      O          O

    Mary   went  to  Seattle  to  visit  Microsoft  Research
    O      O     O   U-Loc    O   O      O          O

    Mary   went  to  Seattle  to  visit  Microsoft  Research
    O      O     O   O        O   O      B-Org      L-Org
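The per-entity splitting described above can be sketched on plain tag lists. This is a simplified illustration that works on strings rather than Instance objects and assumes well-formed BIOUL tags; the function name `split_by_entity` is hypothetical.

```python
from typing import List, Optional


def split_by_entity(tags: List[str]) -> List[List[str]]:
    """Return one tag sequence per entity, with all other tokens set to "O"."""
    per_entity: List[List[str]] = []
    start: Optional[int] = None          # index of the pending B- tag, if any
    for i, tag in enumerate(tags):
        if tag.startswith("U-"):         # single-token entity
            first, last = i, i
        elif tag.startswith("B-"):       # entity begins; wait for its L- tag
            start = i
            continue
        elif tag.startswith("L-") and start is not None:
            first, last = start, i       # multi-token entity ends here
            start = None
        else:                            # "O" or "I-" tokens
            continue
        isolated = ["O"] * len(tags)
        isolated[first:last + 1] = tags[first:last + 1]
        per_entity.append(isolated)
    return per_entity


tags = ["U-Per", "O", "O", "U-Loc", "O", "O", "B-Org", "L-Org"]
instances = split_by_entity(tags)
```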
-
-
class
allennlp.predictors.coref.
CorefPredictor
(model: allennlp.models.model.Model, dataset_reader: allennlp.data.dataset_readers.dataset_reader.DatasetReader, language: str = 'en_core_web_sm')[source]¶ Bases:
allennlp.predictors.predictor.Predictor
Predictor for the CoreferenceResolver model.
-
coref_resolved
(self, document: str) → str[source]¶ Produce a document where each coreference is replaced by its main mention.
- Parameters
- document: ``str``
A string representation of a document.
- Returns
- A string with each coreference replaced by its main mention.
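The idea can be sketched on an already tokenized document and a list of clusters. This simplified version assumes that the first span in each cluster is the main mention, that span indices are inclusive, and that mentions do not nest; it also ignores details such as possessives, which a real implementation would need to handle. The function name `resolve` is hypothetical.

```python
from typing import List


def resolve(tokens: List[str], clusters: List[List[List[int]]]) -> str:
    """Replace every later mention in a cluster with the first mention's text."""
    resolved = list(tokens)
    for cluster in clusters:
        main_start, main_end = cluster[0]
        main_mention = " ".join(tokens[main_start:main_end + 1])
        for start, end in cluster[1:]:
            resolved[start] = main_mention   # put the main mention in place
            for i in range(start + 1, end + 1):
                resolved[i] = ""             # blank out the rest of the span
    return " ".join(token for token in resolved if token)


tokens = ["Mary", "said", "she", "was", "tired", "."]
clusters = [[[0, 0], [2, 2]]]
resolved_text = resolve(tokens, clusters)
```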
-
predict
(self, document: str) → Dict[str, Any][source]¶ Predict the coreference clusters in the given document.
    {
        "document": [tokenised document text]
        "clusters": [
            [
                [start_index, end_index],
                [start_index, end_index]
            ],
            [
                [start_index, end_index],
                [start_index, end_index],
                [start_index, end_index],
            ],
            ....
        ]
    }
- Parameters
- document: ``str``
A string representation of a document.
- Returns
- A dictionary representation of the predicted coreference clusters.
-
predict_tokenized
(self, tokenized_document: List[str]) → Dict[str, Any][source]¶ Predict the coreference clusters in the given document.
- Parameters
- tokenized_document: ``List[str]``
A list of words representing a tokenized document.
- Returns
- A dictionary representation of the predicted coreference clusters.
-
predictions_to_labeled_instances
(self, instance: allennlp.data.instance.Instance, outputs: Dict[str, numpy.ndarray]) → List[allennlp.data.instance.Instance][source]¶ Takes each predicted cluster and makes it into a labeled
Instance
with only that cluster labeled, so we can compute gradients of the loss on the model’s prediction of that cluster. This lets us run interpretation methods using those gradients. See superclass docstring for more info.
-
-
class
allennlp.predictors.constituency_parser.
ConstituencyParserPredictor
(model: allennlp.models.model.Model, dataset_reader: allennlp.data.dataset_readers.dataset_reader.DatasetReader, language: str = 'en_core_web_sm')[source]¶ Bases:
allennlp.predictors.predictor.Predictor
Predictor for the SpanConstituencyParser model.
-
predict
(self, sentence: str) → Dict[str, Any][source]¶ Predict a constituency parse for the given sentence.
- Parameters
- sentence: ``str``
The sentence to parse.
- Returns
- A dictionary representation of the constituency tree.
-
-
class
allennlp.predictors.seq2seq.
Seq2SeqPredictor
(model: allennlp.models.model.Model, dataset_reader: allennlp.data.dataset_readers.dataset_reader.DatasetReader)[source]¶ Bases:
allennlp.predictors.predictor.Predictor
Predictor for sequence to sequence models, including composed_seq2seq, simple_seq2seq, and copynet_seq2seq.
-
class
allennlp.predictors.simple_seq2seq.
SimpleSeq2SeqPredictor
(model: allennlp.models.model.Model, dataset_reader: allennlp.data.dataset_readers.dataset_reader.DatasetReader)[source]¶ Bases:
allennlp.predictors.seq2seq.Seq2SeqPredictor
Predictor for the simple_seq2seq model.
-
class
allennlp.predictors.wikitables_parser.
WikiTablesParserPredictor
(model: allennlp.models.model.Model, dataset_reader: allennlp.data.dataset_readers.dataset_reader.DatasetReader)[source]¶ Bases:
allennlp.predictors.predictor.Predictor
Wrapper for the WikiTablesSemanticParser model.
-
class
allennlp.predictors.nlvr_parser.
NlvrParserPredictor
(model: allennlp.models.model.Model, dataset_reader: allennlp.data.dataset_readers.dataset_reader.DatasetReader)[source]¶
-
class
allennlp.predictors.quarel_parser.
QuarelParserPredictor
(model: allennlp.models.model.Model, dataset_reader: allennlp.data.dataset_readers.dataset_reader.DatasetReader)[source]¶ Bases:
allennlp.predictors.predictor.Predictor
Wrapper for the quarel_semantic_parser model.
-
class
allennlp.predictors.biaffine_dependency_parser.
BiaffineDependencyParserPredictor
(model: allennlp.models.model.Model, dataset_reader: allennlp.data.dataset_readers.dataset_reader.DatasetReader, language: str = 'en_core_web_sm')[source]¶ Bases:
allennlp.predictors.predictor.Predictor
Predictor for the BiaffineDependencyParser model.
-
predict
(self, sentence: str) → Dict[str, Any][source]¶ Predict a dependency parse for the given sentence.
- Parameters
- sentence: ``str``
The sentence to parse.
- Returns
- A dictionary representation of the dependency tree.
-
-
class
allennlp.predictors.open_information_extraction.
OpenIePredictor
(model: allennlp.models.model.Model, dataset_reader: allennlp.data.dataset_readers.dataset_reader.DatasetReader)[source]¶ Bases:
allennlp.predictors.predictor.Predictor
Predictor for the SemanticRoleLabeler model (in its Open Information Extraction variant). Used by the online demo and for prediction on an input file from the command line.
-
predict_json
(self, inputs: Dict[str, Any]) → Dict[str, Any][source]¶ Create instance(s) after predicting the format. One sentence containing multiple verbs will lead to multiple instances.
Expects JSON that looks like
{"sentence": "..."}
Returns a JSON that looks like
{"tokens": [...], "tag_spans": [{"ARG0": "...", "V": "...", "ARG1": "...", ...}]}
-
-
allennlp.predictors.open_information_extraction.
consolidate_predictions
(outputs: List[List[str]], sent_tokens: List[allennlp.data.tokenizers.token.Token]) → Dict[str, List[str]][source]¶ Identify that certain predicates are part of a multiword predicate (e.g., “decided to run”) in which case, we don’t need to return the embedded predicate (“run”).
-
allennlp.predictors.open_information_extraction.
get_coherent_next_tag
(prev_label: str, cur_label: str) → str[source]¶ Generate a coherent tag, given previous tag and current label.
-
allennlp.predictors.open_information_extraction.
get_predicate_indices
(tags: List[str]) → List[int][source]¶ Return the word indices of a predicate in BIO tags.
-
allennlp.predictors.open_information_extraction.
get_predicate_text
(sent_tokens: List[allennlp.data.tokenizers.token.Token], tags: List[str]) → str[source]¶ Get the predicate in this prediction.
-
allennlp.predictors.open_information_extraction.
join_mwp
(tags: List[str]) → List[str][source]¶ Join multi-word predicates to a single predicate (‘V’) token.
-
allennlp.predictors.open_information_extraction.
make_oie_string
(tokens: List[allennlp.data.tokenizers.token.Token], tags: List[str]) → str[source]¶ Converts a list of model outputs (i.e., a list of lists of BIO tags, each pertaining to a single word) into an inline bracket representation of the prediction.
-
allennlp.predictors.open_information_extraction.
merge_overlapping_predictions
(tags1: List[str], tags2: List[str]) → List[str][source]¶ Merge two predictions into one. Assumes the predicate in tags1 overlaps with the predicate of tags2.
-
allennlp.predictors.open_information_extraction.
predicates_overlap
(tags1: List[str], tags2: List[str]) → bool[source]¶ Tests whether the predicate in BIO tags1 overlaps with that of tags2.
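A rough sketch of how predicate positions and predicate overlap could be computed from BIO tags. It assumes predicate tokens carry the label `V` (for example `B-V` or `I-V`); the names `predicate_positions` and `share_predicate_token` are hypothetical and this is not necessarily how the module's own functions are written.

```python
from typing import List


def predicate_positions(tags: List[str]) -> List[int]:
    """Indices of tokens whose BIO label is exactly "V"."""
    # Splitting once on "-" separates the B/I prefix from the label,
    # so a label such as "ARGM-ADV" is not mistaken for a predicate.
    return [i for i, tag in enumerate(tags) if tag.split("-", 1)[-1] == "V"]


def share_predicate_token(tags1: List[str], tags2: List[str]) -> bool:
    """True if the two taggings mark at least one common token as predicate."""
    return bool(set(predicate_positions(tags1)) & set(predicate_positions(tags2)))


# "decided to run": one tagging treats the whole phrase as the predicate,
# the other only the embedded verb "run".
tags_a = ["B-ARG0", "B-V", "I-V", "I-V"]
tags_b = ["B-ARG0", "O", "O", "B-V"]
tags_c = ["B-V", "B-ARGM-ADV", "O", "O"]
```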
-
allennlp.predictors.open_information_extraction.
sanitize_label
(label: str) → str[source]¶ Sanitize a BIO label. This deals with OIE labels sometimes containing noise, such as parentheses.
-
class
allennlp.predictors.event2mind.
Event2MindPredictor
(model: allennlp.models.model.Model, dataset_reader: allennlp.data.dataset_readers.dataset_reader.DatasetReader)[source]¶ Bases:
allennlp.predictors.predictor.Predictor
Predictor for the event2mind model.
-
predict
(self, source: str) → Dict[str, Any][source]¶ Given a source string of some event, returns a JSON dictionary containing, for each target type, the top predicted sequences as indices, as tokens and the log probability of each.
The JSON dictionary looks like:
    {
        `${target_type}_top_k_predictions`: [[1, 2, 3], [4, 5, 6], ...],
        `${target_type}_top_k_predicted_tokens`: [["to", "feel", "brave"], ...],
        `${target_type}_top_k_log_probabilities`: [-0.301, -0.046, ...]
    }
By default, target_type can be xreact, oreact, or xintent.
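As an illustration of consuming this dictionary, the sketch below picks the highest-scoring token sequence for each target type. The key patterns follow the layout shown above; the helper name `best_sequences` and the example values are made up.

```python
from typing import Any, Dict, List, Tuple

TARGET_TYPES = ["xintent", "xreact", "oreact"]


def best_sequences(prediction: Dict[str, Any]) -> Dict[str, Tuple[List[str], float]]:
    """Map each target type to its most probable token sequence and score."""
    best: Dict[str, Tuple[List[str], float]] = {}
    for target_type in TARGET_TYPES:
        tokens = prediction[f"{target_type}_top_k_predicted_tokens"]
        log_probs = prediction[f"{target_type}_top_k_log_probabilities"]
        # A higher (less negative) log probability means a more likely sequence.
        top = max(range(len(log_probs)), key=log_probs.__getitem__)
        best[target_type] = (tokens[top], log_probs[top])
    return best


# Made-up output with two candidate sequences per target type.
example_output = {
    "xintent_top_k_predicted_tokens": [["to", "win"], ["to", "be", "brave"]],
    "xintent_top_k_log_probabilities": [-0.9, -0.2],
    "xreact_top_k_predicted_tokens": [["proud"], ["tired"]],
    "xreact_top_k_log_probabilities": [-0.1, -1.5],
    "oreact_top_k_predicted_tokens": [["impressed"], ["none"]],
    "oreact_top_k_log_probabilities": [-0.4, -0.7],
}
best = best_sequences(example_output)
```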
-
-
class
allennlp.predictors.atis_parser.
AtisParserPredictor
(model: allennlp.models.model.Model, dataset_reader: allennlp.data.dataset_readers.dataset_reader.DatasetReader)[source]¶ Bases:
allennlp.predictors.predictor.Predictor
Predictor for the AtisSemanticParser model.
-
class
allennlp.predictors.text_classifier.
TextClassifierPredictor
(model: allennlp.models.model.Model, dataset_reader: allennlp.data.dataset_readers.dataset_reader.DatasetReader)[source]¶ Bases:
allennlp.predictors.predictor.Predictor
Predictor for any model that takes in a sentence and returns a single class for it. In particular, it can be used with the BasicClassifier model.
-
predictions_to_labeled_instances
(self, instance: allennlp.data.instance.Instance, outputs: Dict[str, numpy.ndarray]) → List[allennlp.data.instance.Instance][source]¶ This function takes a model’s outputs for an Instance, and it labels that instance according to the output. For example, in classification this function labels the instance according to the class with the highest probability. This function is used to compute gradients of what the model predicted. The return type is a list because in some tasks there are multiple predictions in the output (e.g., in NER a model predicts multiple spans). In this case, each instance in the returned list of Instances contains an individual entity prediction as the label.
-
-
class
allennlp.predictors.masked_language_model.
MaskedLanguageModelPredictor
(model: allennlp.models.model.Model, dataset_reader: allennlp.data.dataset_readers.dataset_reader.DatasetReader)[source]¶ Bases:
allennlp.predictors.predictor.Predictor
-
predictions_to_labeled_instances
(self, instance: allennlp.data.instance.Instance, outputs: Dict[str, numpy.ndarray])[source]¶ This function takes a model’s outputs for an Instance, and it labels that instance according to the output. For example, in classification this function labels the instance according to the class with the highest probability. This function is used to compute gradients of what the model predicted. The return type is a list because in some tasks there are multiple predictions in the output (e.g., in NER a model predicts multiple spans). In this case, each instance in the returned list of Instances contains an individual entity prediction as the label.
-
-
class
allennlp.predictors.next_token_lm.
NextTokenLMPredictor
(model: allennlp.models.model.Model, dataset_reader: allennlp.data.dataset_readers.dataset_reader.DatasetReader)[source]¶ Bases:
allennlp.predictors.predictor.Predictor
-
predictions_to_labeled_instances
(self, instance: allennlp.data.instance.Instance, outputs: Dict[str, numpy.ndarray])[source]¶ This function takes a model’s outputs for an Instance, and it labels that instance according to the output. For example, in classification this function labels the instance according to the class with the highest probability. This function is used to compute gradients of what the model predicted. The return type is a list because in some tasks there are multiple predictions in the output (e.g., in NER a model predicts multiple spans). In this case, each instance in the returned list of Instances contains an individual entity prediction as the label.
-