CosineAnnealingWithWarmup

Formulation

The learning rate is annealed with a cosine schedule over n_total training steps, preceded by a warmup period of n_warmup steps. The learning rate at step i therefore rises during the first n_warmup steps and then decays along a half cosine until step n_total.
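A common form of such a schedule can be written as follows. This is a sketch under assumptions: linear warmup from a starting rate to a base rate, then cosine decay to a final rate. The symbols lr_warmup, lr_base and lr_final are named after the warmup_lr, base_lr and final_lr arguments in the Usage section and are not taken from the original formula.

```latex
\mathrm{lr}_i =
\begin{cases}
\mathrm{lr}_{\text{warmup}}
  + \left(\mathrm{lr}_{\text{base}} - \mathrm{lr}_{\text{warmup}}\right)
    \dfrac{i}{n_{\text{warmup}}},
  & 0 \le i < n_{\text{warmup}}, \\[6pt]
\mathrm{lr}_{\text{final}}
  + \dfrac{1}{2}\left(\mathrm{lr}_{\text{base}} - \mathrm{lr}_{\text{final}}\right)
    \left(1 + \cos\left(\pi \, \dfrac{i - n_{\text{warmup}}}{n_{\text{total}} - n_{\text{warmup}}}\right)\right),
  & n_{\text{warmup}} \le i \le n_{\text{total}}.
\end{cases}
```

Under this form the rate equals lr_base at i = n_warmup and lr_final at i = n_total, so the two phases join continuously.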

[Figure: learning rate as a function of the training step under this schedule]

Usage

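# Arguments, in order: optimizer, warmup_epochs, warmup_lr, num_epochs,
# base_lr, final_lr, iter_per_epoch.
# Each learning rate is scaled linearly with the batch size (lr * batch_size / 256).
lr_scheduler = LR_Scheduler(
    optimizer,
    args.warmup_epochs, args.warmup_lr * args.batch_size / 256,
    args.epochs, args.lr * args.batch_size / 256, args.final_lr * args.batch_size / 256,
    len(train_loader),
)

for epoch in range(args.epochs):
    # Each batch is assumed to yield (input, ground truth) pairs.
    for data, gt in train_loader:
        optimizer.zero_grad()
        output = model(data)
        loss = lossfunc(output, gt)
        loss.backward()
        optimizer.step()
        lr_scheduler.step()  # advance the schedule once per iteration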
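The LR_Scheduler class itself is not shown in this README. As a rough illustration of what a scheduler with that argument order might do, here is a minimal sketch. The class name WarmupCosineScheduler, the lr_at helper, and the exact schedule shape (linear warmup, then cosine decay) are assumptions for illustration, not the repository's implementation.

```python
import math


class WarmupCosineScheduler:
    """Hypothetical stand-in for a warmup + cosine-annealing scheduler.

    Takes the same arguments, in the same order, as the LR_Scheduler call.
    """

    def __init__(self, optimizer, warmup_epochs, warmup_lr, num_epochs,
                 base_lr, final_lr, iter_per_epoch):
        self.optimizer = optimizer
        self.warmup_lr = warmup_lr
        self.base_lr = base_lr
        self.final_lr = final_lr
        self.warmup_iters = warmup_epochs * iter_per_epoch
        self.total_iters = num_epochs * iter_per_epoch
        self.iter = 0

    def lr_at(self, i):
        """Learning rate for 0-based step i (assumed schedule shape)."""
        if i < self.warmup_iters:
            # Linear warmup from warmup_lr towards base_lr.
            frac = i / self.warmup_iters
            return self.warmup_lr + (self.base_lr - self.warmup_lr) * frac
        # Cosine decay from base_lr to final_lr over the remaining steps.
        decay_iters = max(1, self.total_iters - self.warmup_iters)
        progress = min(1.0, (i - self.warmup_iters) / decay_iters)
        cosine = 0.5 * (1.0 + math.cos(math.pi * progress))
        return self.final_lr + (self.base_lr - self.final_lr) * cosine

    def step(self):
        """Write the current learning rate into every parameter group."""
        lr = self.lr_at(self.iter)
        for group in self.optimizer.param_groups:
            group["lr"] = lr
        self.iter += 1
        return lr
```

The sketch only relies on the optimizer exposing param_groups, a list of dicts with an "lr" key, as torch.optim optimizers do. Calling step() once per training iteration keeps the schedule aligned with iter_per_epoch.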

In the CV domain [1, 2], a common way to adapt automatically to different batch sizes is to use a learning rate of lr × BatchSize / 256 (linear scaling [4]). A larger batch size permits a larger learning rate, especially with the LARS optimizer [3]. You can of course modify this to suit your own requirements.
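The linear scaling rule is simple arithmetic. The small sketch below makes it concrete; the helper name scale_lr and the example base rate of 0.05 are illustrative assumptions, not values from the repository.

```python
def scale_lr(lr: float, batch_size: int, reference_batch_size: int = 256) -> float:
    """Linearly scale a learning rate that was tuned for the reference batch size."""
    if batch_size <= 0 or reference_batch_size <= 0:
        raise ValueError("batch sizes must be positive")
    return lr * batch_size / reference_batch_size


# Show how an (assumed) base rate of 0.05 changes with the batch size.
for batch_size in (128, 256, 512, 1024):
    print(batch_size, scale_lr(0.05, batch_size))
```

Doubling the batch size doubles the learning rate, and a batch size of 256 leaves it unchanged.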

Reference

[1] Jean-Bastien Grill, Florian Strub, Florent Altché, Corentin Tallec, Pierre H. Richemond, Elena Buchatskaya, Carl Doersch, Bernardo Avila Pires, Zhaohan Daniel Guo, Mohammad Gheshlaghi Azar, Bilal Piot, Koray Kavukcuoglu, Rémi Munos, and Michal Valko. Bootstrap your own latent: A new approach to self-supervised learning. arXiv:2006.07733v1, 2020.
[2] Xinlei Chen and Kaiming He. Exploring simple siamese representation learning. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 15750-15758, 2021.
[3] Yang You, Igor Gitman, and Boris Ginsburg. Large batch training of convolutional networks. arXiv:1708.03888, 2017.
[4] Priya Goyal, Piotr Dollár, Ross Girshick, Pieter Noordhuis, Lukasz Wesolowski, Aapo Kyrola, Andrew Tulloch, Yangqing Jia, and Kaiming He. Accurate, large minibatch SGD: Training ImageNet in 1 hour. arXiv:1706.02677, 2017.
[5] https://github.com/PatrickHua/SimSiam?utm_source=catalyzex.com

Owner

zhuyun
