
create_elmo_embeddings_from_vocab

allennlp.tools.create_elmo_embeddings_from_vocab



main

def main(
    vocab_path: str,
    elmo_config_path: str,
    elmo_weights_path: str,
    output_dir: str,
    batch_size: int,
    device: int,
    use_custom_oov_token: bool = False
)

Creates ELMo word representations from a vocabulary file. These word representations are context-independent: they are the result of running the CNN and Highway layers of the ELMo model, but not the bidirectional LSTM. ELMo requires two additional tokens: <S> and </S>. The first token in the vocabulary file is assumed to be an unknown token.
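As an illustration of the vocabulary handling described above, here is a minimal sketch. It assumes the two boundary tokens are placed directly after the leading unknown token; the description does not state the exact position, so treat the placement, the helper name `insert_boundary_tokens`, and the sample vocabulary as hypothetical rather than as the script's actual behaviour.

```python
from typing import List


def insert_boundary_tokens(tokens: List[str]) -> List[str]:
    """Return a new vocabulary with <S> and </S> added.

    Assumption: the first entry is the unknown token and stays first,
    and the boundary tokens go immediately after it.
    """
    if not tokens:
        raise ValueError("vocabulary is empty")
    return [tokens[0], "<S>", "</S>"] + tokens[1:]


# Hypothetical three-word vocabulary, unknown token first.
vocab = ["@@UNKNOWN@@", "the", "cat"]
extended = insert_boundary_tokens(vocab)
```

The original list is left unchanged, so the input vocabulary can still be compared against the extended one.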

This script produces two artifacts: a new vocabulary file with the <S> and </S> tokens inserted, and a GloVe-formatted embedding file containing one word and its vector per line, with all values separated by a single space.
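The embedding file layout described above (a word followed by its vector values, space-separated, one entry per line) can be read back with a few lines of Python. This is a sketch only: `read_glove_format` is a hypothetical helper, the sample data is made up, and it takes any open text handle because the description does not say whether the output file is compressed.

```python
import io
from typing import Dict, List, TextIO


def read_glove_format(handle: TextIO) -> Dict[str, List[float]]:
    """Parse 'word v1 v2 ... vn' lines into a word -> vector mapping."""
    embeddings: Dict[str, List[float]] = {}
    for line in handle:
        fields = line.rstrip("\n").split(" ")
        if len(fields) < 2:
            # Skip blank or malformed lines that carry no vector values.
            continue
        embeddings[fields[0]] = [float(value) for value in fields[1:]]
    return embeddings


# Made-up two-entry file with three-dimensional vectors.
sample = io.StringIO("<S> 0.1 0.2 0.3\nthe 0.4 0.5 0.6\n")
vectors = read_glove_format(sample)
```

Splitting on a single space rather than on arbitrary whitespace keeps the parser aligned with the stated format, where the first field is always the token.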