Skip to content





class InputReduction(Attacker):
 | def __init__(self, predictor: Predictor, beam_size: int = 3) -> None

Runs the input reduction method from Pathologies of Neural Models Make Interpretations Difficult, which removes as many words as possible from the input without changing the model's prediction.

The functions on this class handle a special case for NER by looking for a field called "tags" This check is brittle, i.e., the code could break if the name of this field has changed, or if a non-NER model has a field called "tags".

Registered as an Attacker with name "input-reduction".


class InputReduction(Attacker):
 | ...
 | def attack_from_json(
 |     self,
 |     inputs: JsonDict,
 |     input_field_to_attack: str = "tokens",
 |     grad_input_field: str = "grad_input_1",
 |     ignore_tokens: List[str] = None,
 |     target: JsonDict = None
 | )