bias_mitigators

allennlp.fairness.bias_mitigators

A suite of differentiable methods to mitigate biases for binary concepts in embeddings.

BiasMitigator

class BiasMitigator:
 | def __init__(self, requires_grad: bool = False)

Parent class for bias mitigator classes.

Parameters

  • requires_grad : bool, optional (default = False)
    Option to enable gradient calculation.
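
For example, gradient tracking can be enabled so that the mitigation step stays inside the autograd graph during training. A minimal sketch using one of the subclasses defined below:

    from allennlp.fairness.bias_mitigators import LinearBiasMitigator

    # requires_grad=True enables gradient calculation through the
    # mitigation step, e.g. when debiasing embeddings inside a model's
    # forward pass during training.
    mitigator = LinearBiasMitigator(requires_grad=True)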

HardBiasMitigator

class HardBiasMitigator(BiasMitigator)

Hard bias mitigator. Mitigates bias in embeddings by:

  1. Neutralizing: ensuring that protected variable-neutral words remain equidistant from both protected variable groups by removing the component of their embeddings along the bias direction.

  2. Equalizing: recentering each pair of protected variable-related words so that both members have the same norm and differ only along the bias direction (both steps are sketched below).
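
As a hedged illustration for a one-dimensional bias subspace, the two steps amount to the following tensor operations. This is a minimal sketch assuming unit-normalized input embeddings; the function and variable names are illustrative, not part of the API:

    import torch
    import torch.nn.functional as F

    def hard_debias_sketch(evaluation_embeddings, bias_direction,
                           equalize_embeddings1, equalize_embeddings2):
        v = bias_direction  # (dim,), unit vector

        # 1. Neutralize: remove each embedding's component along v,
        #    then renormalize.
        proj = (evaluation_embeddings @ v).unsqueeze(-1) * v
        neutralized = F.normalize(evaluation_embeddings - proj, dim=-1)

        # 2. Equalize: recenter each pair around the bias-free part of
        #    its mean so both members share a norm and differ only along v.
        mu = (equalize_embeddings1 + equalize_embeddings2) / 2
        mu_b = (mu @ v).unsqueeze(-1) * v        # mean's component along v
        nu = mu - mu_b                           # bias-free part of the mean
        scale = (1 - nu.norm(dim=-1, keepdim=True) ** 2).clamp(min=0).sqrt()
        e1_b = (equalize_embeddings1 @ v).unsqueeze(-1) * v - mu_b
        e2_b = (equalize_embeddings2 @ v).unsqueeze(-1) * v - mu_b
        equalized1 = nu + scale * F.normalize(e1_b, dim=-1)
        equalized2 = nu + scale * F.normalize(e2_b, dim=-1)

        return torch.cat([neutralized, equalized1, equalized2])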

Note

For a detailed walkthrough and visual descriptions of the steps, please refer to Figure 4 in VERB: Visualizing and Interpreting Bias Mitigation Techniques for Word Representations.

Based on: T. Bolukbasi, K. W. Chang, J. Zou, V. Saligrama, and A. Kalai. Man is to computer programmer as woman is to homemaker? Debiasing word embeddings. In Advances in Neural Information Processing Systems, 2016.

Description taken from: Goenka, D. (2020). Tackling Gender Bias in Word Embeddings.

Implementation and terminology based on Rathore, A., Dev, S., Phillips, J.M., Srikumar, V., Zheng, Y., Yeh, C.M., Wang, J., Zhang, W., & Wang, B. (2021). VERB: Visualizing and Interpreting Bias Mitigation Techniques for Word Representations. ArXiv, abs/2104.02797.

__call__

class HardBiasMitigator(BiasMitigator):
 | ...
 | def __call__(
 |     self,
 |     evaluation_embeddings: torch.Tensor,
 |     bias_direction: torch.Tensor,
 |     equalize_embeddings1: torch.Tensor,
 |     equalize_embeddings2: torch.Tensor
 | )

Note

In the examples below, we treat gender identity as binary, which does not accurately characterize gender in real life.

Parameters

  • evaluation_embeddings : torch.Tensor
    A tensor of size (evaluation_batch_size, ..., dim) of embeddings for which to mitigate bias.
  • bias_direction : torch.Tensor
    A unit tensor of size (dim, ) representing the concept subspace. The words that are used to define the bias direction are considered definitionally gendered and not modified.
  • equalize_embeddings1 : torch.Tensor
    A tensor of size (equalize_batch_size, ..., dim) containing equalize word embeddings related to a group from the concept represented by bias_direction. For example, if the concept is gender, equalize_embeddings1 could contain embeddings for "boy", "man", "dad", "brother", etc.
  • equalize_embeddings2 : torch.Tensor
    A tensor of size (equalize_batch_size, ..., dim) containing equalize word embeddings related to a different group for the same concept. For example, equalize_embeddings2 could contain embeddings for "girl", "woman", "mom", "sister", etc.

Note

The embeddings at the same positions in each of equalize_embeddings1 and equalize_embeddings2 are expected to form equalize word pairs. For example, if the concept is gender, the embeddings for ("boy", "girl"), ("man", "woman"), ("dad", "mom"), ("brother", "sister"), etc. should be at the same positions in equalize_embeddings1 and equalize_embeddings2.

Note

evaluation_embeddings, equalize_embeddings1, and equalize_embeddings2 must have the same size in all but the 0th (batch) dimension.

Note

Please ensure that the words in evaluation_embeddings, equalize_embeddings1, and equalize_embeddings2 and those used to compute bias_direction are disjoint.

Note

All tensors are expected to be on the same device.

Returns

  • bias_mitigated_embeddings : torch.Tensor
    A tensor containing the bias-mitigated versions of evaluation_embeddings, equalize_embeddings1, and equalize_embeddings2, concatenated (in this order) along the 0th (batch) dimension.
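
A hedged usage sketch. The random tensors are hypothetical stand-ins; in practice the embeddings come from a real embedding layer and bias_direction from a bias-direction method:

    import torch
    import torch.nn.functional as F
    from allennlp.fairness.bias_mitigators import HardBiasMitigator

    dim = 50
    evaluation_embeddings = torch.randn(8, dim)   # words to debias
    bias_direction = F.normalize(torch.randn(dim), dim=-1)
    equalize_embeddings1 = torch.randn(4, dim)    # e.g. "boy", "man", ...
    equalize_embeddings2 = torch.randn(4, dim)    # e.g. "girl", "woman", ...

    mitigator = HardBiasMitigator()
    out = mitigator(evaluation_embeddings, bias_direction,
                    equalize_embeddings1, equalize_embeddings2)

    # The output concatenates the three mitigated inputs along the
    # batch dimension, in the order they were passed in.
    mitigated_evaluation = out[:8]
    mitigated_equalize1 = out[8:12]
    mitigated_equalize2 = out[12:16]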

LinearBiasMitigator

class LinearBiasMitigator(BiasMitigator)

Linear bias mitigator. Mitigates bias in embeddings by removing their component along the bias direction.
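
Concretely, this is an orthogonal projection: each embedding loses its dot-product component along the unit bias direction. A minimal sketch (the function name is illustrative, not part of the API):

    import torch

    def remove_bias_component(x: torch.Tensor, v: torch.Tensor) -> torch.Tensor:
        # x: (..., dim) embeddings; v: (dim,) unit bias direction
        return x - (x @ v).unsqueeze(-1) * v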

Note

For a detailed walkthrough and visual descriptions of the steps, please refer to Figure 3 in VERB: Visualizing and Interpreting Bias Mitigation Techniques for Word Representations.

Based on: S. Dev and J. M. Phillips. Attenuating bias in word vectors. In International Conference on Artificial Intelligence and Statistics, Proceedings of Machine Learning Research, pages 879–887. PMLR, 2019.

Implementation and terminology based on Rathore, A., Dev, S., Phillips, J.M., Srikumar, V., Zheng, Y., Yeh, C.M., Wang, J., Zhang, W., & Wang, B. (2021). VERB: Visualizing and Interpreting Bias Mitigation Techniques for Word Representations. ArXiv, abs/2104.02797.

__call__

class LinearBiasMitigator(BiasMitigator):
 | ...
 | def __call__(
 |     self,
 |     evaluation_embeddings: torch.Tensor,
 |     bias_direction: torch.Tensor
 | )

Note

In the examples below, we treat gender identity as binary, which does not accurately characterize gender in real life.

Parameters

  • evaluation_embeddings : torch.Tensor
    A tensor of size (batch_size, ..., dim) of embeddings for which to mitigate bias.
  • bias_direction : torch.Tensor
    A unit tensor of size (dim, ) representing the concept subspace.

Note

All tensors are expected to be on the same device.

Returns

  • bias_mitigated_embeddings : torch.Tensor
    A tensor of the same size as evaluation_embeddings.
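
A hedged usage sketch with hypothetical stand-in tensors:

    import torch
    import torch.nn.functional as F
    from allennlp.fairness.bias_mitigators import LinearBiasMitigator

    evaluation_embeddings = torch.randn(8, 50)
    bias_direction = F.normalize(torch.randn(50), dim=-1)

    mitigator = LinearBiasMitigator()
    mitigated = mitigator(evaluation_embeddings, bias_direction)

    # The mitigated embeddings should carry (near-)zero component
    # along the bias direction.
    assert torch.allclose(mitigated @ bias_direction,
                          torch.zeros(8), atol=1e-5)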

INLPBiasMitigator

class INLPBiasMitigator(BiasMitigator):
 | def __init__(self)

Iterative Nullspace Projection. Mitigates bias by repeatedly training a linear classifier that separates the concept groups and projecting all embeddings onto the classifier's nullspace, i.e. removing the component along the classifier normal.
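
As a hedged illustration of the iteration, the sketch below uses scikit-learn's LinearSVC as the linear classifier, which is an assumption made for illustration; the module's actual classifier choice may differ:

    import numpy as np
    import torch
    from sklearn.svm import LinearSVC

    def inlp_sketch(evaluation, seed1, seed2, num_iters=35):
        X = torch.cat([seed1, seed2]).numpy()
        y = np.concatenate([np.zeros(len(seed1)), np.ones(len(seed2))])
        evaluation = evaluation.clone()
        for _ in range(num_iters):
            # Train a linear classifier separating the two seed groups.
            clf = LinearSVC(max_iter=10000).fit(X, y)
            w = torch.from_numpy(clf.coef_[0]).float()
            w = w / w.norm()
            # Project onto the classifier's nullspace: x := x - (x . w) w
            evaluation = evaluation - (evaluation @ w).unsqueeze(-1) * w
            X = X - (X @ w.numpy())[:, None] * w.numpy()
        return evaluation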

Note

For a detailed walkthrough and visual descriptions of the steps, please refer to Figure 5 in VERB: Visualizing and Interpreting Bias Mitigation Techniques for Word Representations.

Based on: Ravfogel, S., Elazar, Y., Gonen, H., Twiton, M., & Goldberg, Y. (2020). Null It Out: Guarding Protected Attributes by Iterative Nullspace Projection. ArXiv, abs/2004.07667.

Implementation and terminology based on Rathore, A., Dev, S., Phillips, J.M., Srikumar, V., Zheng, Y., Yeh, C.M., Wang, J., Zhang, W., & Wang, B. (2021). VERB: Visualizing and Interpreting Bias Mitigation Techniques for Word Representations. ArXiv, abs/2104.02797.

__call__

class INLPBiasMitigator(BiasMitigator):
 | ...
 | def __call__(
 |     self,
 |     evaluation_embeddings: torch.Tensor,
 |     seed_embeddings1: torch.Tensor,
 |     seed_embeddings2: torch.Tensor,
 |     num_iters: int = 35
 | )

Note

In the examples below, we treat gender identity as binary, which does not accurately characterize gender in real life.

Parameters

  • evaluation_embeddings : torch.Tensor
    A tensor of size (evaluation_batch_size, ..., dim) of embeddings for which to mitigate bias.
  • seed_embeddings1 : torch.Tensor
    A tensor of size (embeddings1_batch_size, ..., dim) containing seed word embeddings related to a specific concept group. For example, if the concept is gender, seed_embeddings1 could contain embeddings for linguistically masculine words, e.g. "man", "king", "brother", etc.
  • seed_embeddings2 : torch.Tensor
    A tensor of size (embeddings2_batch_size, ..., dim) containing seed word embeddings related to a different group for the same concept. For example, seed_embeddings2 could contain embeddings for linguistically feminine words, e.g. "woman", "queen", "sister", etc.
  • num_iters : int, optional (default = 35)
    Number of times to build the classifier and project the embeddings along its normal.

Note

seed_embeddings1 and seed_embeddings2 need NOT be the same size. Furthermore, the embeddings at the same positions in each of seed_embeddings1 and seed_embeddings2 are NOT expected to form seed word pairs.

Note

All tensors are expected to be on the same device.

Note

This bias mitigator is not differentiable.

Returns

  • bias_mitigated_embeddings : torch.Tensor
    A tensor of the same size as evaluation_embeddings.
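
A hedged usage sketch with hypothetical stand-in tensors:

    import torch
    from allennlp.fairness.bias_mitigators import INLPBiasMitigator

    evaluation_embeddings = torch.randn(8, 50)
    seed_embeddings1 = torch.randn(10, 50)   # e.g. "man", "king", ...
    seed_embeddings2 = torch.randn(12, 50)   # e.g. "woman", "queen", ...
    # The two seed groups need not be the same size.

    mitigator = INLPBiasMitigator()
    mitigated = mitigator(evaluation_embeddings, seed_embeddings1,
                          seed_embeddings2, num_iters=35)
    assert mitigated.shape == evaluation_embeddings.shape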

OSCaRBiasMitigator

class OSCaRBiasMitigator(BiasMitigator)

OSCaR bias mitigator. Mitigates bias in embeddings by dissociating concept subspaces through subspace orthogonalization. Formally, OSCaR applies a graded rotation on the embedding space to rectify two ideally-independent concept subspaces so that they become orthogonal.

Note

For a detailed walkthrough and visual descriptions of the steps, please refer to Figure 6 in VERB: Visualizing and Interpreting Bias Mitigation Techniques for Word Representations.

Based on: Dev, S., Li, T., Phillips, J.M., & Srikumar, V. (2020). OSCaR: Orthogonal Subspace Correction and Rectification of Biases in Word Embeddings. ArXiv, abs/2007.00049.

Implementation and terminology based on Rathore, A., Dev, S., Phillips, J.M., Srikumar, V., Zheng, Y., Yeh, C.M., Wang, J., Zhang, W., & Wang, B. (2021). VERB: Visualizing and Interpreting Bias Mitigation Techniques for Word Representations. ArXiv, abs/2104.02797.

__call__

class OSCaRBiasMitigator(BiasMitigator):
 | ...
 | def __call__(
 |     self,
 |     evaluation_embeddings: torch.Tensor,
 |     bias_direction1: torch.Tensor,
 |     bias_direction2: torch.Tensor
 | )

Parameters

  • evaluation_embeddings : torch.Tensor
    A tensor of size (batch_size, ..., dim) of embeddings for which to mitigate bias.
  • bias_direction1 : torch.Tensor
    A unit tensor of size (dim, ) representing a concept subspace (e.g. gender).
  • bias_direction2 : torch.Tensor
    A unit tensor of size (dim, ) representing another concept subspace from which bias_direction1 should be dissociated (e.g. occupation).

Note

All tensors are expected to be on the same device.

Returns

  • bias_mitigated_embeddings : torch.Tensor
    A tensor of the same size as evaluation_embeddings.
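
A hedged usage sketch with hypothetical stand-in tensors; in practice both directions come from a bias-direction method, e.g. a gender subspace and an occupation subspace:

    import torch
    import torch.nn.functional as F
    from allennlp.fairness.bias_mitigators import OSCaRBiasMitigator

    evaluation_embeddings = torch.randn(8, 50)
    bias_direction1 = F.normalize(torch.randn(50), dim=-1)  # e.g. gender
    bias_direction2 = F.normalize(torch.randn(50), dim=-1)  # e.g. occupation

    mitigator = OSCaRBiasMitigator()
    mitigated = mitigator(evaluation_embeddings,
                          bias_direction1, bias_direction2)
    assert mitigated.shape == evaluation_embeddings.shape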