allennlp.data.instance
- class allennlp.data.instance.Instance(fields: MutableMapping[str, allennlp.data.fields.field.Field])
  Bases: collections.abc.Mapping, typing.Generic

  An Instance is a collection of Field objects, specifying the inputs and outputs to some model. We don't make a distinction between inputs and outputs here, though - all operations are done on all fields, and when we return arrays, we return them as dictionaries keyed by field name. A model can then decide which fields it wants to use as inputs and which as outputs.

  The Fields in an Instance can start out either indexed or un-indexed. During the data processing pipeline, all fields will be indexed, after which multiple instances can be combined into a Batch and then converted into padded arrays.

  Parameters
  - fields : Dict[str, Field]
    The Field objects that will be used to produce data arrays for this instance.
- add_field(self, field_name: str, field: allennlp.data.fields.field.Field, vocab: allennlp.data.vocabulary.Vocabulary = None) → None
  Add the field to the existing fields mapping. If we have already indexed the Instance, then we also index field, so it is necessary to supply the vocab.
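The add_field behaviour described above can be sketched with small stand-in classes. This is not AllenNLP's implementation: ToyField, ToyInstance, the plain dict used as a vocabulary, the fallback id 0 for unknown tokens, and the explicit ValueError are all hypothetical choices made for illustration. The only behaviour taken from the description is that a field added to an already-indexed instance is indexed immediately.

```python
from typing import Dict, List, Optional


class ToyField:
    """Hypothetical stand-in for a Field holding string tokens."""

    def __init__(self, tokens: List[str]) -> None:
        self.tokens = tokens
        self.indexed_tokens: Optional[List[int]] = None

    def index(self, vocab: Dict[str, int]) -> None:
        # Map each token to an id; unknown tokens fall back to id 0 (an assumption).
        self.indexed_tokens = [vocab.get(token, 0) for token in self.tokens]


class ToyInstance:
    """Hypothetical stand-in for Instance: a name -> field mapping plus a flag."""

    def __init__(self, fields: Dict[str, ToyField]) -> None:
        self.fields = fields
        self.indexed = False

    def index_fields(self, vocab: Dict[str, int]) -> None:
        if not self.indexed:
            self.indexed = True
            for field in self.fields.values():
                field.index(vocab)

    def add_field(self, field_name: str, field: ToyField,
                  vocab: Optional[Dict[str, int]] = None) -> None:
        self.fields[field_name] = field
        # Once the instance is indexed, a late field must be indexed too,
        # which is why the vocabulary has to be supplied at this point.
        if self.indexed:
            if vocab is None:
                # Raising here is an illustrative choice, not documented behaviour.
                raise ValueError("vocab is required once the instance is indexed")
            field.index(vocab)


toy_vocab = {"@@UNK@@": 0, "the": 1, "cat": 2}
instance = ToyInstance({"sentence": ToyField(["the", "cat"])})
instance.index_fields(toy_vocab)

# The instance is already indexed, so the new field is indexed on arrival.
instance.add_field("title", ToyField(["cat", "dog"]), vocab=toy_vocab)
added_ids = instance.fields["title"].indexed_tokens
```

Before index_fields has been called, the same add_field call would simply store the field and leave indexing to the later pass, so vocab could be omitted.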
- as_tensor_dict(self, padding_lengths: Dict[str, Dict[str, int]] = None) → Dict[str, ~DataArray]
  Pads each Field in this instance to the lengths given in padding_lengths (which is keyed by field name, then by padding key, the same as the return value of get_padding_lengths()), returning a dictionary of torch tensors keyed by field name.

  If padding_lengths is omitted, we will call self.get_padding_lengths() to get the sizes of the tensors to create.
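The two-level keying of padding_lengths, and the fallback when it is omitted, can be illustrated with a minimal pure-Python sketch. Plain lists stand in for torch tensors here, and toy_fields, as_padded_dict, the "num_tokens" padding key, and the padding value 0 are assumptions introduced for the example rather than names confirmed by this page.

```python
from typing import Dict, List, Optional

# Hypothetical instance: field name -> already-indexed token ids.
toy_fields: Dict[str, List[int]] = {
    "premise": [4, 8, 15],
    "hypothesis": [16, 23],
}


def get_padding_lengths(fields: Dict[str, List[int]]) -> Dict[str, Dict[str, int]]:
    # Keyed by field name, then by padding key ("num_tokens" is an assumed key).
    return {name: {"num_tokens": len(ids)} for name, ids in fields.items()}


def as_padded_dict(
    fields: Dict[str, List[int]],
    padding_lengths: Optional[Dict[str, Dict[str, int]]] = None,
    padding_value: int = 0,
) -> Dict[str, List[int]]:
    # Fall back to the instance's own lengths when none are supplied.
    if padding_lengths is None:
        padding_lengths = get_padding_lengths(fields)
    padded = {}
    for name, ids in fields.items():
        target = padding_lengths[name]["num_tokens"]
        # Right-pad, then cut to exactly the requested length.
        padded[name] = (ids + [padding_value] * target)[:target]
    return padded


# No lengths given: each field keeps its own length.
own_lengths = as_padded_dict(toy_fields)

# Explicit lengths, as a batch might supply so that all instances line up.
batch_lengths = as_padded_dict(
    toy_fields,
    {"premise": {"num_tokens": 5}, "hypothesis": {"num_tokens": 5}},
)
```

In this sketch the result is keyed by field name, mirroring the Dict[str, DataArray] return annotation in the signature.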
- count_vocab_items(self, counter: Dict[str, Dict[str, int]])
  Increments counts in the given counter for all of the vocabulary items in all of the Fields in this Instance.
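The nested counter can be pictured with the following sketch. It assumes the outer key is a vocabulary namespace and the inner key is the item being counted; that reading, along with toy_instance and the standalone count_vocab_items function, is an assumption made for illustration and is not stated on this page.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# Hypothetical instance: field name -> (assumed namespace, items in the field).
toy_instance: Dict[str, Tuple[str, List[str]]] = {
    "sentence": ("tokens", ["the", "cat", "saw", "the", "dog"]),
    "label": ("labels", ["positive"]),
}


def count_vocab_items(
    instance: Dict[str, Tuple[str, List[str]]],
    counter: Dict[str, Dict[str, int]],
) -> None:
    # Every field adds its items to the shared counter, in place.
    for namespace, items in instance.values():
        for item in items:
            counter[namespace][item] += 1


# A nested defaultdict lets unseen namespaces and items start at zero.
counter: Dict[str, Dict[str, int]] = defaultdict(lambda: defaultdict(int))

# Counting the same toy instance twice stands in for two instances in a dataset.
count_vocab_items(toy_instance, counter)
count_vocab_items(toy_instance, counter)
```

Because the counter is mutated rather than returned, one counter can be passed across every instance in a dataset to accumulate totals.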
- get_padding_lengths(self) → Dict[str, Dict[str, int]]
  Returns a dictionary of padding lengths, keyed by field name. Each Field returns a mapping from padding keys to actual lengths, and we just key that dictionary by field name.
- index_fields(self, vocab: allennlp.data.vocabulary.Vocabulary) → None
  Indexes all fields in this Instance using the provided Vocabulary. This mutates the current object; it does not return a new Instance. A DataIterator will call this on each pass through a dataset; we use the indexed flag to make sure that indexing only happens once.

  This means that if for some reason you modify your vocabulary after you've indexed your instances, you might get unexpected behavior.
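The consequence of indexing only once can be seen in a minimal sketch. OnceIndexedInstance, the dict-based vocabulary, and the fallback id 0 for unknown tokens are hypothetical stand-ins, not the library's classes. Only the early return guarded by the indexed flag reflects the description above.

```python
from typing import Dict, List


class OnceIndexedInstance:
    """Hypothetical stand-in for Instance; not the library's implementation."""

    def __init__(self, tokens: List[str]) -> None:
        self.tokens = tokens
        self.token_ids: List[int] = []
        self.indexed = False

    def index_fields(self, vocab: Dict[str, int]) -> None:
        # The flag makes repeated calls (for example, one per pass) a no-op.
        if self.indexed:
            return
        # Unknown tokens map to id 0 (an assumption of this sketch).
        self.token_ids = [vocab.get(token, 0) for token in self.tokens]
        self.indexed = True


toy_vocab = {"@@UNK@@": 0, "the": 1}
instance = OnceIndexedInstance(["the", "cat"])

instance.index_fields(toy_vocab)
first_pass = list(instance.token_ids)

# Extending the vocabulary afterwards does not change the stored ids,
# because the second call returns early.
toy_vocab["cat"] = 2
instance.index_fields(toy_vocab)
second_pass = list(instance.token_ids)
```

In this sketch "cat" stays mapped to the unknown id after the vocabulary grows, which is the kind of stale result the warning refers to.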