towhee.trainer.optimization.optimization.get_polynomial_decay_schedule_with_warmup
- towhee.trainer.optimization.optimization.get_polynomial_decay_schedule_with_warmup(optimizer, num_warmup_steps, num_training_steps, lr_end=1e-07, power=1.0, last_epoch=-1)
Create a schedule with a learning rate that decreases as a polynomial decay from the initial learning rate set in the optimizer to the end learning rate defined by lr_end, after a warmup period during which it increases linearly from 0 to the initial learning rate set in the optimizer.
- Parameters:
  - optimizer (Optimizer) – The optimizer for which to schedule the learning rate.
  - num_warmup_steps (int) – The number of steps for the warmup phase.
  - num_training_steps (int) – The total number of training steps.
  - lr_end (float, optional, defaults to 1e-7) – The end learning rate.
  - power (float, optional, defaults to 1.0) – Power factor of the polynomial decay.
  - last_epoch (int, optional, defaults to -1) – The index of the last epoch when resuming training.
Note: power defaults to 1.0 as in the fairseq implementation, which in turn is based on the original BERT implementation at https://github.com/google-research/bert/blob/f39e881b169b9d53bea03d2d341b31707a6c052b/optimization.py#L37
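Conceptually, the returned LambdaLR scales the optimizer's initial learning rate by a step-dependent multiplier. The sketch below is a rough reconstruction of that multiplier, assuming the fairseq/BERT-style polynomial decay referenced in the note above; the hyperparameter values are illustrative, not defaults of this function:

```python
def lr_lambda(current_step: int) -> float:
    # Illustrative values; in practice these come from the function's arguments
    # and the initial lr set in the optimizer.
    lr_init = 5e-5
    lr_end = 1e-7
    power = 1.0
    num_warmup_steps = 100
    num_training_steps = 1000

    if current_step < num_warmup_steps:
        # Linear warmup: multiplier rises from 0 to 1.
        return current_step / max(1, num_warmup_steps)
    if current_step > num_training_steps:
        # After training ends, hold at lr_end
        # (LambdaLR multiplies this factor by lr_init).
        return lr_end / lr_init
    # Polynomial decay from lr_init down to lr_end.
    lr_range = lr_init - lr_end
    decay_steps = num_training_steps - num_warmup_steps
    pct_remaining = 1 - (current_step - num_warmup_steps) / decay_steps
    return (lr_range * pct_remaining ** power + lr_end) / lr_init
```

With power=1.0 the decay phase is a straight line from the initial learning rate to lr_end; larger powers front-load the decay.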
- Returns: torch.optim.lr_scheduler.LambdaLR with the appropriate schedule.
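A minimal usage sketch, assuming the standard PyTorch optimizer/scheduler loop; the model, loss, and step counts here are placeholders:

```python
import torch
from towhee.trainer.optimization.optimization import (
    get_polynomial_decay_schedule_with_warmup,
)

model = torch.nn.Linear(10, 2)
optimizer = torch.optim.AdamW(model.parameters(), lr=5e-5)

# Warm up for 100 steps, then decay (power=1.0, i.e. linearly) to lr_end
# over the remaining 900 steps.
scheduler = get_polynomial_decay_schedule_with_warmup(
    optimizer,
    num_warmup_steps=100,
    num_training_steps=1000,
    lr_end=1e-7,
    power=1.0,
)

for step in range(1000):
    loss = model(torch.randn(4, 10)).sum()  # placeholder loss
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
    scheduler.step()  # advance the schedule once per optimizer step
```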