allennlp.data.instance
- class allennlp.data.instance.Instance(fields: MutableMapping[str, allennlp.data.fields.field.Field])
Bases: collections.abc.Mapping, typing.Generic
An Instance is a collection of Field objects, specifying the inputs and outputs to some model. We don't make a distinction between inputs and outputs here, though: all operations are done on all fields, and when we return arrays, we return them as dictionaries keyed by field name. A model can then decide which fields it wants to use as inputs and which as outputs.

The Fields in an Instance can start out either indexed or un-indexed. During the data processing pipeline, all fields will be indexed, after which multiple instances can be combined into a Batch and then converted into padded arrays.

- Parameters
- fields : Dict[str, Field]
The Field objects that will be used to produce data arrays for this instance.
- add_field(self, field_name: str, field: allennlp.data.fields.field.Field, vocab: allennlp.data.vocabulary.Vocabulary = None) → None
Add the field to the existing fields mapping. If we have already indexed the Instance, then we also index the new field, so it is necessary to supply the vocab.
- as_tensor_dict(self, padding_lengths: Dict[str, Dict[str, int]] = None) → Dict[str, ~DataArray]
Pads each Field in this instance to the lengths given in padding_lengths (which is keyed by field name, then by padding key, the same as the return value of get_padding_lengths()), returning the torch tensors for each field in a dictionary keyed by field name.

If padding_lengths is omitted, we will call self.get_padding_lengths() to get the sizes of the tensors to create.
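The two-level keying of padding_lengths (field name, then padding key) can be sketched with plain Python lists standing in for torch tensors. Everything here is hypothetical and for illustration only: ToyTextField, as_padded_dict, the "num_tokens" padding key, and the padding value 0 are assumptions, not AllenNLP's actual implementation.

```python
from typing import Dict, List, Optional

PAD = 0  # assumed padding value for this sketch


class ToyTextField:
    """Hypothetical stand-in for an already-indexed Field holding token ids."""

    def __init__(self, token_ids: List[int]) -> None:
        self.token_ids = token_ids

    def get_padding_lengths(self) -> Dict[str, int]:
        # One padding key; real fields may report several.
        return {"num_tokens": len(self.token_ids)}

    def as_padded_list(self, padding_lengths: Dict[str, int]) -> List[int]:
        # Truncate or right-pad to the requested length.
        target = padding_lengths["num_tokens"]
        kept = self.token_ids[:target]
        return kept + [PAD] * (target - len(kept))


def as_padded_dict(
    fields: Dict[str, ToyTextField],
    padding_lengths: Optional[Dict[str, Dict[str, int]]] = None,
) -> Dict[str, List[int]]:
    # If no lengths are given, fall back to each field's own lengths.
    if padding_lengths is None:
        padding_lengths = {
            name: field.get_padding_lengths() for name, field in fields.items()
        }
    return {
        name: field.as_padded_list(padding_lengths[name])
        for name, field in fields.items()
    }


fields = {"premise": ToyTextField([4, 9, 2]), "hypothesis": ToyTextField([7, 3])}
padded = as_padded_dict(
    fields, {"premise": {"num_tokens": 5}, "hypothesis": {"num_tokens": 5}}
)
unpadded = as_padded_dict(fields)
```

When padding_lengths is omitted, each field is sized to its own content, so nothing is padded; explicit lengths are what make instances in a batch line up.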
- count_vocab_items(self, counter: Dict[str, Dict[str, int]])
Increments counts in the given counter for all of the vocabulary items in all of the Fields in this Instance.
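As a rough illustration of the nested counter, the sketch below assumes the outer key is a vocabulary namespace and the inner key is the item being counted. That reading of Dict[str, Dict[str, int]] is an assumption here, and ToyTextField and the free function count_vocab_items are hypothetical stand-ins rather than the library's classes.

```python
from collections import defaultdict
from typing import Dict, List


class ToyTextField:
    """Hypothetical un-indexed field: raw tokens plus an assumed namespace."""

    def __init__(self, tokens: List[str], namespace: str = "tokens") -> None:
        self.tokens = tokens
        self.namespace = namespace

    def count_vocab_items(self, counter: Dict[str, Dict[str, int]]) -> None:
        # Increment the shared counter in place for every token.
        for token in self.tokens:
            counter[self.namespace][token] += 1


def count_vocab_items(
    fields: Dict[str, ToyTextField], counter: Dict[str, Dict[str, int]]
) -> None:
    # Delegate to every field; the same counter accumulates across fields.
    for field in fields.values():
        field.count_vocab_items(counter)


counter: Dict[str, Dict[str, int]] = defaultdict(lambda: defaultdict(int))
fields = {
    "sentence": ToyTextField(["the", "cat", "the"]),
    "label": ToyTextField(["positive"], namespace="labels"),
}
count_vocab_items(fields, counter)
```

Because the counter is mutated rather than returned, the same object can be passed across many instances to accumulate counts for a whole dataset.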
- get_padding_lengths(self) → Dict[str, Dict[str, int]]
Returns a dictionary of padding lengths, keyed by field name. Each Field returns a mapping from padding keys to actual lengths, and we just key that dictionary by field name.
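The shape of the returned dictionary can be sketched as follows. The toy field classes, the "num_tokens" padding key, and the max_padding_lengths helper (which merges several instances' lengths, roughly what a batch would need before padding) are all hypothetical names chosen for this example.

```python
from typing import Dict, List


class ToyTextField:
    """Hypothetical field whose only padding key is its token count."""

    def __init__(self, tokens: List[str]) -> None:
        self.tokens = tokens

    def get_padding_lengths(self) -> Dict[str, int]:
        return {"num_tokens": len(self.tokens)}


class ToyLabelField:
    """Hypothetical field that needs no padding at all."""

    def __init__(self, label: str) -> None:
        self.label = label

    def get_padding_lengths(self) -> Dict[str, int]:
        return {}


def get_padding_lengths(fields) -> Dict[str, Dict[str, int]]:
    # Key each field's own mapping by the field's name.
    return {name: field.get_padding_lengths() for name, field in fields.items()}


def max_padding_lengths(
    all_lengths: List[Dict[str, Dict[str, int]]]
) -> Dict[str, Dict[str, int]]:
    # Take the per-key maximum across instances (assumed batching step).
    merged: Dict[str, Dict[str, int]] = {}
    for lengths in all_lengths:
        for field_name, by_key in lengths.items():
            target = merged.setdefault(field_name, {})
            for key, value in by_key.items():
                target[key] = max(target.get(key, 0), value)
    return merged


first = get_padding_lengths(
    {"sentence": ToyTextField(["the", "cat", "sat"]), "label": ToyLabelField("pos")}
)
second = get_padding_lengths(
    {"sentence": ToyTextField(["hello"]), "label": ToyLabelField("neg")}
)
batch_lengths = max_padding_lengths([first, second])
```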
- index_fields(self, vocab: allennlp.data.vocabulary.Vocabulary) → None
Indexes all fields in this Instance using the provided Vocabulary. This mutates the current object; it does not return a new Instance. A DataIterator will call this on each pass through a dataset; we use the indexed flag to make sure that indexing only happens once.

This means that if for some reason you modify your vocabulary after you've indexed your instances, the instances will not be re-indexed and you might get unexpected behavior.
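A minimal sketch of the index-once behavior, assuming a plain dict as the vocabulary and 0 as the id for unknown tokens. ToyTextField and ToyInstance are hypothetical classes written for this example and only mimic the guard described above.

```python
from typing import Dict, List, Optional


class ToyTextField:
    """Hypothetical field that maps tokens to ids when indexed."""

    def __init__(self, tokens: List[str]) -> None:
        self.tokens = tokens
        self.token_ids: Optional[List[int]] = None

    def index(self, vocab: Dict[str, int]) -> None:
        # Unknown tokens map to 0 in this sketch.
        self.token_ids = [vocab.get(token, 0) for token in self.tokens]


class ToyInstance:
    """Hypothetical instance that indexes its fields at most once."""

    def __init__(self, fields: Dict[str, ToyTextField]) -> None:
        self.fields = fields
        self.indexed = False

    def index_fields(self, vocab: Dict[str, int]) -> None:
        # The flag makes repeated calls a no-op; the object is mutated in place.
        if not self.indexed:
            self.indexed = True
            for field in self.fields.values():
                field.index(vocab)


vocab = {"the": 1, "cat": 2}
instance = ToyInstance({"sentence": ToyTextField(["the", "cat"])})

instance.index_fields(vocab)
first_ids = list(instance.fields["sentence"].token_ids)

# Changing the vocabulary afterwards has no effect on the stored ids.
vocab["cat"] = 5
instance.index_fields(vocab)
second_ids = list(instance.fields["sentence"].token_ids)
```

Here second_ids still reflects the old vocabulary, which is the stale-index situation the warning describes.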