[ allennlp.interpret.attackers.attacker ]
class Attacker(Registrable): | def __init__(self, predictor: Predictor) -> None
An Attacker modifies an input (e.g., adds or deletes tokens) to try to change an AllenNLP
Predictor's output in a desired manner (e.g., make it incorrect).
class Attacker(Registrable): | ... | def initialize(self)
Initializes any components of the Attacker that are expensive to compute, so that they are
not created on __init__(). The default implementation does nothing (pass).
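The deferred-setup pattern described above can be sketched as follows. This is a minimal, self-contained illustration rather than AllenNLP code: LazyAttacker, vocab_index, and the three-word vocabulary are hypothetical stand-ins, and a real subclass would extend Attacker and build whatever expensive structure its attack needs.

```python
from typing import Dict, Optional


class LazyAttacker:
    """Hypothetical stand-in for an Attacker subclass (not the AllenNLP class)."""

    def __init__(self, predictor: object) -> None:
        # Construction stays cheap: only references are stored here.
        self.predictor = predictor
        self.vocab_index: Optional[Dict[str, int]] = None

    def initialize(self) -> None:
        # Expensive work is deferred to this method and done at most once.
        if self.vocab_index is None:
            words = ["good", "bad", "movie"]  # hypothetical vocabulary
            self.vocab_index = {word: i for i, word in enumerate(words)}


attacker = LazyAttacker(predictor=None)
built_on_init = attacker.vocab_index is not None
attacker.initialize()
built_after_initialize = attacker.vocab_index is not None
```

Calling initialize() a second time leaves the already-built structure in place, so callers can invoke it defensively before each attack.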
class Attacker(Registrable): | ... | def attack_from_json( | self, | inputs: JsonDict, | input_field_to_attack: str, | grad_input_field: str, | ignore_tokens: List[str], | target: JsonDict | ) -> JsonDict
This function finds a modification to the input text that would change the model's prediction in some desired manner (e.g., an adversarial attack).
- inputs :
The input you want to attack (the same as the argument to a Predictor, e.g., predict_json()).
- input_field_to_attack :
The key in the inputs JsonDict you want to attack, e.g., tokens.
- grad_input_field :
The field in the gradients dictionary that contains the input gradients. For example,
grad_input_1 will be the field for single-input tasks. See get_gradients() in
Predictor for more information on field names.
- target :
If given, this is a targeted attack, which tries to change the prediction to a particular value instead of just changing it from its original prediction. Subclasses are not required to accept this argument, as not all attacks make sense as targeted attacks. Perhaps that means we should make the API more crisp, but adding another class is not worth it.
- reduced_input :
The return value. Contains the final, sanitized input after adversarial modification.
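To make the roles of the arguments concrete, here is a toy sketch that mirrors the attack_from_json signature. Everything besides the parameter names is an assumption made for illustration: toy_predict stands in for a Predictor, ToyDeletionAttacker is a hypothetical class that deletes one token at a time instead of using gradients (so grad_input_field is accepted but unused), and the default of None for target is chosen here only because the argument is optional.

```python
from typing import Any, Callable, Dict, List, Optional

JsonDict = Dict[str, Any]


def toy_predict(inputs: JsonDict) -> JsonDict:
    """Hypothetical predictor: 'negative' if the text contains 'bad'."""
    tokens = inputs["tokens"].split()
    return {"label": "negative" if "bad" in tokens else "positive"}


class ToyDeletionAttacker:
    """Mirrors the attack_from_json signature; the attack itself is made up."""

    def __init__(self, predict: Callable[[JsonDict], JsonDict]) -> None:
        self.predict = predict

    def attack_from_json(
        self,
        inputs: JsonDict,
        input_field_to_attack: str,
        grad_input_field: str,  # unused: this toy attack has no gradients
        ignore_tokens: List[str],
        target: Optional[JsonDict] = None,
    ) -> JsonDict:
        original = self.predict(inputs)
        tokens = inputs[input_field_to_attack].split()
        for i, token in enumerate(tokens):
            if token in ignore_tokens:
                continue  # never delete a protected token
            candidate = dict(inputs)
            candidate[input_field_to_attack] = " ".join(tokens[:i] + tokens[i + 1:])
            output = self.predict(candidate)
            # Targeted: hit the requested output. Untargeted: change it at all.
            success = output == target if target is not None else output != original
            if success:
                return candidate
        return dict(inputs)  # no single deletion worked; input is unchanged


attacker = ToyDeletionAttacker(toy_predict)
inputs = {"tokens": "a bad movie"}
untargeted = attacker.attack_from_json(inputs, "tokens", "grad_input_1", [])
blocked = attacker.attack_from_json(inputs, "tokens", "grad_input_1", ["bad"])
targeted = attacker.attack_from_json(
    inputs, "tokens", "grad_input_1", [], target={"label": "positive"}
)
```

In this sketch the untargeted and targeted calls both remove the token that drives the prediction, while listing that token in ignore_tokens leaves the input unchanged. Real attackers in the library may return a richer dictionary than the bare modified input shown here.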