noam
allennlp.training.learning_rate_schedulers.noam
NoamLR
@LearningRateScheduler.register("noam")
class NoamLR(LearningRateScheduler):
| def __init__(
| self,
| optimizer: torch.optim.Optimizer,
| model_size: int,
| warmup_steps: int,
| factor: float = 1.0,
| last_epoch: int = -1
| ) -> None
Implements the Noam learning rate schedule. This corresponds to increasing the learning rate linearly for the first warmup_steps training steps, and decreasing it thereafter proportionally to the inverse square root of the step number, scaled by the inverse square root of the dimensionality of the model. Time will tell if this is just madness or it's actually important.

Registered as a LearningRateScheduler with name "noam".
Parameters

- optimizer : torch.optim.Optimizer
  This argument does not get an entry in a configuration file for the object.
- model_size : int
  The hidden size parameter which dominates the number of parameters in your model.
- warmup_steps : int
  The number of steps to linearly increase the learning rate.
- factor : float, optional (default = 1.0)
  The overall scale factor for the learning rate decay.
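A minimal construction sketch using the parameters above; the stand-in model, the Adam optimizer, and the values 512 and 4000 are hypothetical, chosen only for illustration:

```python
import torch
from allennlp.training.learning_rate_schedulers.noam import NoamLR

model = torch.nn.Linear(16, 16)  # stand-in model; only its parameters matter here
optimizer = torch.optim.Adam(model.parameters(), lr=1.0)

scheduler = NoamLR(optimizer=optimizer, model_size=512, warmup_steps=4000)
```

When built from a configuration file, the scheduler is selected by its registered name "noam" and model_size, warmup_steps, and factor are supplied there; the optimizer is passed in by the trainer rather than listed in the configuration, as noted above.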
step
class NoamLR(LearningRateScheduler):
| ...
| @overrides
| def step(self, metric: float = None) -> None
step_batch
class NoamLR(LearningRateScheduler):
| ...
| def step_batch(self, batch_num_total: int = None) -> None
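step_batch is the hook a trainer calls once per batch with the running batch count. A self-contained sketch of that usage, reusing the same hypothetical construction values as above (the loop body is elided):

```python
import torch
from allennlp.training.learning_rate_schedulers.noam import NoamLR

optimizer = torch.optim.Adam(torch.nn.Linear(16, 16).parameters(), lr=1.0)
scheduler = NoamLR(optimizer=optimizer, model_size=512, warmup_steps=4000)

for batch_num_total in range(1, 10001):
    # ... forward pass, loss.backward(), and optimizer.step() would go here ...
    scheduler.step_batch(batch_num_total)  # updates the optimizer's learning rate each batch
```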
get_values
class NoamLR(LearningRateScheduler):
| ...
| def get_values(self)