moving_average
allennlp.training.moving_average
NamedParameter#
NamedParameter = Tuple[str, torch.Tensor]
MovingAverage#
class MovingAverage(Registrable):
| def __init__(self, parameters: Iterable[NamedParameter]) -> None
Tracks a moving average of model parameters.
default_implementation#
class MovingAverage(Registrable):
| ...
| default_implementation = "exponential"
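Because the class is Registrable with "exponential" as its default_implementation, a configuration that omits the "type" key resolves to ExponentialMovingAverage. A minimal sketch of that lookup, assuming standard AllenNLP from_params behavior; the model here is a stand-in, not part of the library:

```python
import torch
from allennlp.common import Params
from allennlp.training.moving_average import MovingAverage

model = torch.nn.Linear(4, 2)  # stand-in model for illustration

# No "type" key: Registrable falls back to default_implementation,
# so this builds an ExponentialMovingAverage. Note that `parameters`
# is supplied directly rather than through the Params (see the note
# under ExponentialMovingAverage below).
moving_average = MovingAverage.from_params(
    Params({}), parameters=model.named_parameters()
)
```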
apply#
class MovingAverage(Registrable):
| ...
| def apply(self, num_updates: Optional[int] = None)
Update the moving averages based on the latest values of the parameters.
assign_average_value#
class MovingAverage(Registrable):
| ...
| def assign_average_value(self) -> None
Replace all the parameter values with the averages. Save the current parameter values to restore later.
restore#
class MovingAverage(Registrable):
| ...
| def restore(self) -> None
Restore the backed-up (non-average) parameter values.
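Taken together, the three methods support the usual pattern: update the averages after each optimizer step, swap the averages in for evaluation, then restore the live weights. A minimal sketch of that lifecycle, using the "exponential" implementation documented below; the model, data, and validation step are stand-ins, not part of the library:

```python
import torch
from allennlp.training.moving_average import ExponentialMovingAverage

model = torch.nn.Linear(4, 2)  # stand-in model
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
moving_average = ExponentialMovingAverage(model.named_parameters())

for step in range(100):
    x, y = torch.randn(8, 4), torch.randn(8, 2)  # stand-in batch
    loss = torch.nn.functional.mse_loss(model(x), y)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    moving_average.apply(num_updates=step)  # update the shadow averages

# Evaluate with the averaged weights, then put the live weights back.
moving_average.assign_average_value()
with torch.no_grad():
    val_loss = torch.nn.functional.mse_loss(
        model(torch.randn(8, 4)), torch.randn(8, 2)
    )
moving_average.restore()
```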
ExponentialMovingAverage#
@MovingAverage.register("exponential")
class ExponentialMovingAverage(MovingAverage):
| def __init__(
| self,
| parameters: Iterable[NamedParameter],
| decay: float = 0.9999,
| numerator: float = 1.0,
| denominator: float = 10.0
| ) -> None
Create shadow variables and maintain an exponential moving average for model parameters.
Registered as a MovingAverage with name "exponential".
Parameters
- parameters : Iterable[Tuple[str, Parameter]]
  The parameters whose averages we'll be tracking. In a typical AllenNLP configuration file, this argument does not get an entry under "moving_average"; it gets passed in separately.
- decay : float, optional (default = 0.9999)
  The decay rate that will be used if num_updates is not passed (and that will be used as an upper bound if num_updates is passed).
- numerator : float, optional (default = 1.0)
  The numerator used to compute the decay rate if num_updates is passed.
- denominator : float, optional (default = 10.0)
  The denominator used to compute the decay rate if num_updates is passed.
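For example, the constructor can be called directly with non-default hyperparameters; the values here are illustrative and the model is a stand-in:

```python
import torch
from allennlp.training.moving_average import ExponentialMovingAverage

model = torch.nn.Linear(4, 2)  # stand-in model

# Illustrative settings: cap the decay at 0.999 instead of the default 0.9999.
moving_average = ExponentialMovingAverage(
    model.named_parameters(),
    decay=0.999,
    numerator=1.0,
    denominator=10.0,
)
```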
apply#
class ExponentialMovingAverage(MovingAverage):
| ...
| def apply(self, num_updates: Optional[int] = None) -> None
Apply the exponential moving average update to the parameters being tracked.
The optional num_updates
parameter allows one to tweak the decay rate
dynamically. If passed, the actual decay rate used is:
`min(decay, (numerator + num_updates) / (denominator + num_updates))`
(This logic is based on the TensorFlow exponential moving average: https://www.tensorflow.org/api_docs/python/tf/train/ExponentialMovingAverage)
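To make the schedule concrete, the sketch below mirrors that formula (the helper is illustrative, not part of the library) and prints the effective decay under the default hyperparameters:

```python
def effective_decay(num_updates, decay=0.9999, numerator=1.0, denominator=10.0):
    # Mirrors min(decay, (numerator + num_updates) / (denominator + num_updates)).
    return min(decay, (numerator + num_updates) / (denominator + num_updates))

for n in (0, 10, 1_000, 100_000):
    print(n, round(effective_decay(n), 6))
# 0 -> 0.1, 10 -> 0.55, 1000 -> ~0.991089, 100000 -> 0.9999 (capped).
# Early in training the average tracks the parameters closely; once the
# ratio exceeds `decay`, the ceiling takes over and updates move the
# shadow values only slightly.
```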