polynomial_decay

allennlp.training.learning_rate_schedulers.polynomial_decay

PolynomialDecay¶

@LearningRateScheduler.register("polynomial_decay")
class PolynomialDecay(LearningRateScheduler):
 | def __init__(
 |     self,
 |     optimizer: torch.optim.Optimizer,
 |     num_epochs: int,
 |     num_steps_per_epoch: int,
 |     power=1.0,
 |     warmup_steps=0,
 |     end_learning_rate=0.0,
 |     last_epoch: int = -1
 | )

Implements polynomial decay learning rate scheduling. The learning rate is first increased linearly over the first warmup_steps training steps. It is then decayed over the remaining total_steps - warmup_steps steps, from the initial learning rate to end_learning_rate, using a polynomial of degree power. Here total_steps is the total number of training steps, num_epochs * num_steps_per_epoch.

Formally,

lr = (initial_lr - end_learning_rate) * ((total_steps - steps) / (total_steps - warmup_steps)) ** power + end_learning_rate
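The schedule can be sketched as a standalone function to make the warmup and decay phases concrete. This is a minimal illustration, not the library's implementation: the function name polynomial_decay_lr and the example numbers are hypothetical. It includes the end_learning_rate offset so that the decay runs from the initial learning rate down to end_learning_rate, as described above, and it assumes the rate is held at end_learning_rate once steps reaches total_steps.

```python
def polynomial_decay_lr(
    step: int,
    initial_lr: float,
    total_steps: int,
    warmup_steps: int = 0,
    power: float = 1.0,
    end_learning_rate: float = 0.0,
) -> float:
    """Hypothetical helper: learning rate at a given training step."""
    # Warmup phase: scale linearly from 0 up to initial_lr.
    if warmup_steps > 0 and step < warmup_steps:
        return initial_lr * step / warmup_steps
    # Assumption: after the last step, hold the final learning rate.
    if step >= total_steps:
        return end_learning_rate
    # Fraction of the decay phase that is still remaining, in (0, 1].
    remaining = (total_steps - step) / (total_steps - warmup_steps)
    return (initial_lr - end_learning_rate) * remaining ** power + end_learning_rate


# Illustrative values only: 10 epochs of 100 steps each, base rate 0.1.
total_steps = 10 * 100
for step in (0, 50, 100, 550, 1000):
    lr = polynomial_decay_lr(step, 0.1, total_steps, warmup_steps=100, power=2.0)
    print(step, lr)
```

With power set to 1.0 the decay phase is a straight line; larger values of power drop the rate faster early in the decay phase and flatten it near the end.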

Parameters¶

optimizer : torch.optim.Optimizer
This argument does not get an entry in a configuration file for the object.
num_epochs : int
The number of epochs in the experiment. This does NOT get an entry in the config.
num_steps_per_epoch : int
The number of steps per epoch. This does NOT get an entry in the config.
warmup_steps : int, optional (default = 0)
The number of steps over which the learning rate is linearly increased.
power : float, optional (default = 1.0)
The power of the polynomial used for decaying.
end_learning_rate : float, optional (default = 0.0)
Final learning rate to decay towards.

Example¶

Config for using the PolynomialDecay learning rate scheduler with warmup_steps set to 100, power set to 2, and end_learning_rate set to 1e-10.

{
    ...
    "trainer": {
        ...
        "learning_rate_scheduler": {
            "type": "polynomial_decay",
            "power": 2,
            "warmup_steps": 100,
            "end_learning_rate": 1e-10
        },
        ...
    }
}

Note that you do NOT pass an optimizer, num_epochs, or num_steps_per_epoch key to the learning rate scheduler.

get_values¶

class PolynomialDecay(LearningRateScheduler):
 | ...
 | def get_values(self)

step¶

class PolynomialDecay(LearningRateScheduler):
 | ...
 | def step(self, metric: float = None) -> None

step_batch¶

class PolynomialDecay(LearningRateScheduler):
 | ...
 | def step_batch(self, batch_num_total: int = None) -> None