openie

join_mwp#

def join_mwp(tags: List[str]) -> List[str]

Join multi-word predicates to a single predicate ('V') token.

make_oie_string#

def make_oie_string(tokens: List[Token], tags: List[str]) -> str

Converts a list of model outputs (i.e., a list of lists of bio tags, each pertaining to a single word), returns an inline bracket representation of the prediction.

get_predicate_indices#

def get_predicate_indices(tags: List[str]) -> List[int]

Return the word indices of a predicate in BIO tags.

get_predicate_text#

def get_predicate_text(
    sent_tokens: List[Token],
    tags: List[str]
) -> str

Get the predicate in this prediction.

predicates_overlap#

def predicates_overlap(tags1: List[str], tags2: List[str]) -> bool

Tests whether the predicate in BIO tags1 overlap with those of tags2.

get_coherent_next_tag#

def get_coherent_next_tag(prev_label: str, cur_label: str) -> str

Generate a coherent tag, given previous tag and current label.

merge_overlapping_predictions#

def merge_overlapping_predictions(
    tags1: List[str],
    tags2: List[str]
) -> List[str]

Merge two predictions into one. Assumes the predicate in tags1 overlap with the predicate of tags2.

consolidate_predictions#

def consolidate_predictions(
    outputs: List[List[str]],
    sent_tokens: List[Token]
) -> Dict[str, List[str]]

Identify that certain predicates are part of a multiword predicate (e.g., "decided to run") in which case, we don't need to return the embedded predicate ("run").

sanitize_label#

def sanitize_label(label: str) -> str

Sanitize a BIO label - this deals with OIE labels sometimes having some noise, as parentheses.

OpenIePredictor#

class OpenIePredictor(Predictor):
 | def __init__(
 |     self,
 |     model: Model,
 |     dataset_reader: DatasetReader
 | ) -> None

Predictor for the SemanticRolelabeler model (in its Open Information variant). Used by online demo and for prediction on an input file using command line.

predict_json#

class OpenIePredictor(Predictor):
 | ...
 | @overrides
 | def predict_json(self, inputs: JsonDict) -> JsonDict

Create instance(s) after predicting the format. One sentence containing multiple verbs will lead to multiple instances.

Expects JSON that looks like {"sentence": "..."}

Returns a JSON that looks like:

{"tokens": [...],
 "tag_spans": [{"ARG0": "...",
                "V": "...",
                "ARG1": "...",
                 ...}]}