# polynomial_decay

allennlp.training.learning_rate_schedulers.polynomial_decay

## PolynomialDecay

``````
class PolynomialDecay(LearningRateScheduler):
| def __init__(
|     self,
|     optimizer: torch.optim.Optimizer,
|     num_epochs: int,
|     num_steps_per_epoch: int,
|     power=1.0,
|     warmup_steps=0,
|     end_learning_rate=0.0,
|     last_epoch: int = -1
| )
``````

Implements polynomial decay learning rate scheduling. The learning rate is first increased linearly for the first `warmup_steps` training steps. It is then decayed over the remaining `total_steps` - `warmup_steps` steps, from the initial learning rate down to `end_learning_rate`, using a polynomial of degree `power`.

Formally,

`lr` = (`initial_lr` - `end_learning_rate`) * ((`total_steps` - `steps`) / (`total_steps` - `warmup_steps`)) ** `power` + `end_learning_rate`
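As a concrete illustration, the sketch below implements the schedule as a standalone function. The decay phase follows the formula above; the linear warmup ramp starting from zero and the clamping to `end_learning_rate` once `total_steps` is reached are assumptions about the boundary behaviour, and the function itself is not part of the library.

``````
def polynomial_decay_lr(
    step: int,
    initial_lr: float,
    total_steps: int,
    warmup_steps: int = 0,
    power: float = 1.0,
    end_learning_rate: float = 0.0,
) -> float:
    """Sketch of the schedule described above, not the library implementation."""
    if warmup_steps > 0 and step < warmup_steps:
        # Assumed linear warmup from 0 up to the initial learning rate.
        return initial_lr * step / warmup_steps
    if step >= total_steps:
        # Assumed clamp once the schedule is exhausted.
        return end_learning_rate
    decay = ((total_steps - step) / (total_steps - warmup_steps)) ** power
    return (initial_lr - end_learning_rate) * decay + end_learning_rate


# With power=1.0 the decay phase is a straight line from initial_lr to end_learning_rate:
# polynomial_decay_lr(50, 1e-3, total_steps=500, warmup_steps=50)   -> 0.001
# polynomial_decay_lr(275, 1e-3, total_steps=500, warmup_steps=50)  -> 0.0005
# polynomial_decay_lr(500, 1e-3, total_steps=500, warmup_steps=50)  -> 0.0
``````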

Parameters

• optimizer : `torch.optim.Optimizer`
The optimizer whose learning rate will be scheduled.
• num_epochs : `int`
The number of epochs to schedule over. The product `num_epochs` * `num_steps_per_epoch` gives `total_steps`, the total number of steps to adjust the learning rate for.
• num_steps_per_epoch : `int`
The number of training steps (batches) in each epoch.
• power : `float`, optional (default = `1.0`)
The power of the polynomial used for decaying.
• warmup_steps : `int`, optional (default = `0`)
The number of steps to linearly increase the learning rate.
• end_learning_rate : `float`, optional (default = `0.0`)
Final learning rate to decay towards.
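
A minimal construction sketch based on the constructor signature above; the model, optimizer, and hyperparameter values are placeholders. In practice the scheduler is usually selected from a training configuration rather than constructed by hand.

``````
import torch

from allennlp.training.learning_rate_schedulers.polynomial_decay import PolynomialDecay

model = torch.nn.Linear(10, 2)  # placeholder model
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3)

# Schedule over 5 epochs of 100 batches each, with 50 linear warmup steps,
# i.e. total_steps = 5 * 100 = 500.
scheduler = PolynomialDecay(
    optimizer,
    num_epochs=5,
    num_steps_per_epoch=100,
    power=1.0,
    warmup_steps=50,
    end_learning_rate=0.0,
)
``````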

### get_values

``````
class PolynomialDecay(LearningRateScheduler):
| ...
| @overrides
| def get_values(self)
``````
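
Assuming `get_values` returns the current scheduled learning rate for each of the optimizer's parameter groups, the whole schedule can be traced by advancing `step_batch` and reading the values back. This continues the construction sketch above.

``````
# Hypothetical inspection of the 500-step schedule defined above.
learning_rates = []
for batch_num in range(1, 501):
    scheduler.step_batch(batch_num)
    # One value per optimizer parameter group; this example has a single group.
    learning_rates.append(scheduler.get_values()[0])

# learning_rates ramps up over the first 50 steps, then decays towards 0.
``````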

### step

``````
class PolynomialDecay(LearningRateScheduler):
| ...
| @overrides
| def step(self, metric: float = None) -> None
``````

### step_batch

``````
class PolynomialDecay(LearningRateScheduler):
| ...
| @overrides
| def step_batch(self, batch_num_total: int = None) -> None
``````
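
The loop below is a hand-written sketch showing where the two hooks fit; inside AllenNLP the `Trainer` makes the equivalent calls. The schedule is expressed in training steps, so the per-batch `step_batch` call is what advances it, while `step` is the end-of-epoch hook. The batches and loss are placeholders, continuing the construction sketch above.

``````
data_loader = [torch.randn(8, 10) for _ in range(100)]  # placeholder batches

for epoch in range(5):
    for batch in data_loader:
        optimizer.zero_grad()
        loss = model(batch).sum()  # placeholder forward pass and loss
        loss.backward()
        optimizer.step()
        scheduler.step_batch()  # advance the schedule by one training step
    scheduler.step()  # end-of-epoch hook
``````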