create_elmo_embeddings_from_vocab
allennlp.tools.create_elmo_embeddings_from_vocab
main
def main(
    vocab_path: str,
    elmo_config_path: str,
    elmo_weights_path: str,
    output_dir: str,
    batch_size: int,
    device: int,
    use_custom_oov_token: bool = False
)
Creates ELMo word representations from a vocabulary file. These
word representations are independent: they are the result of running
the CNN and Highway layers of the ELMo model, but not the bidirectional LSTM,
so each vector depends only on the word itself and not on its context.
ELMo requires 2 additional tokens: <S> and </S>. The first token
in the vocabulary file is assumed to be an unknown token.
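The vocabulary handling can be illustrated with a minimal sketch. It assumes the two sentence-boundary tokens are placed immediately after the first (unknown) token; that position is an assumption made here for illustration. The helper name `insert_boundary_tokens` is hypothetical and is not part of the script.

```python
from typing import List


def insert_boundary_tokens(tokens: List[str]) -> List[str]:
    """Return a new vocabulary list with <S> and </S> added.

    Assumption (not confirmed by the text above): the boundary tokens go
    directly after the first entry, which is treated as the unknown token.
    """
    if not tokens:
        raise ValueError("The vocabulary must contain at least an unknown token.")
    # Keep the unknown token first, then the two boundary tokens, then the rest.
    return [tokens[0], "<S>", "</S>"] + tokens[1:]


# Illustrative vocabulary; "@@UNKNOWN@@" stands in for the unknown token.
vocab = ["@@UNKNOWN@@", "the", "cat"]
new_vocab = insert_boundary_tokens(vocab)
print(new_vocab)
```

The sketch returns a new list and leaves the input untouched, which mirrors the script writing a new vocabulary file alongside the original.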
This script produces two artifacts: a new vocabulary file
with the <S> and </S> tokens inserted, and a GloVe-formatted embedding
file containing one word and its vector per line, with all values
separated by a space.
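To make the embedding file layout concrete, here is a small sketch of parsing lines in that format: a word followed by its vector values, all separated by single spaces. The function name `parse_embedding_lines` and the sample values are hypothetical; the output file name and any compression are not specified here, so the sketch works on an iterable of already-decoded lines.

```python
from typing import Dict, Iterable, List


def parse_embedding_lines(lines: Iterable[str]) -> Dict[str, List[float]]:
    """Parse GloVe-style lines of the form 'word v1 v2 ... vn'."""
    embeddings: Dict[str, List[float]] = {}
    for line in lines:
        parts = line.rstrip("\n").split(" ")
        if len(parts) < 2:
            # Skip blank or malformed lines that carry no vector values.
            continue
        word, values = parts[0], parts[1:]
        embeddings[word] = [float(value) for value in values]
    return embeddings


# Made-up sample lines; real vectors would have the ELMo projection dimension.
sample = ["the 0.1 -0.2 0.3\n", "cat 0.4 0.5 -0.6\n"]
vectors = parse_embedding_lines(sample)
print(sorted(vectors))
```

Splitting on a single space, rather than arbitrary whitespace, keeps the parser aligned with the stated format, and the word is always taken from the first field.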