An end-to-end machine learning library for directly optimizing AUC losses

Overview

LibAUC

An end-to-end machine learning library for AUC optimization.

Why LibAUC?

Deep AUC Maximization (DAM) is a paradigm for learning a deep neural network by maximizing the AUC score of the model on a dataset. Maximizing the AUC score has several benefits over minimizing standard losses such as cross-entropy:

  • In many domains, AUC is the default metric for evaluating and comparing methods. Directly maximizing the AUC score can potentially lead to the largest improvement in the model's performance.
  • Many real-world datasets are imbalanced. AUC is better suited to imbalanced data distributions, since maximizing AUC aims to rank the prediction score of any positive example higher than that of any negative example (see the sketch below).
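
To make the ranking interpretation concrete, below is a minimal sketch of a pairwise squared-hinge AUC surrogate in plain PyTorch. This illustrates the ranking idea only; it is not LibAUC's AUCMLoss (which optimizes a min-max reformulation that avoids enumerating pairs), and the function name and margin parameter are illustrative:

import torch

def pairwise_auc_loss(preds, targets, margin=1.0):
    # Penalize every (positive, negative) pair whose positive score does not
    # exceed the negative score by at least `margin`. Driving this loss to
    # zero ranks every positive above every negative, i.e., AUC = 1.
    pos = preds[targets == 1]                   # scores of positive examples
    neg = preds[targets == 0]                   # scores of negative examples
    diff = pos.view(-1, 1) - neg.view(1, -1)    # all pairwise score gaps
    return torch.clamp(margin - diff, min=0).pow(2).mean()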

Installation

$ pip install libauc

Usage

Official Tutorials:

  • 01. Creating Imbalanced Benchmark Datasets [Notebook][Script]
  • 02. Training ResNet20 with Imbalanced CIFAR10 [Notebook][Script]
  • 03. Training with PyTorch Learning Rate Scheduling [Notebook][Script]
  • 04. Training with Imbalanced Datasets in a Distributed Setting [Coming soon]

Quickstart for beginners:

>>> # import library
>>> from libauc.losses import AUCMLoss
>>> from libauc.optimizers import PESG
...
>>> # define loss and optimizer
>>> Loss = AUCMLoss(imratio=0.1)
>>> optimizer = PESG(imratio=0.1)
...
>>> # training
>>> model.train()
>>> for data, targets in trainloader:
...     data, targets = data.cuda(), targets.cuda()
...     preds = model(data)
...     loss = Loss(preds, targets)
...     optimizer.zero_grad()
...     loss.backward(retain_graph=True)
...     optimizer.step()
...
>>> # restart stage
>>> optimizer.update_regularizer()
...
>>> # evaluation
>>> model.eval()
>>> for data, targets in testloader:
...     data, targets = data.cuda(), targets.cuda()
...     preds = model(data)
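
The evaluation loop above computes predictions but no metric. One common way to turn them into a test AUC is shown in the sketch below, which continues the quickstart and assumes binary targets and scikit-learn (which LibAUC does not itself require):

import torch
from sklearn.metrics import roc_auc_score

model.eval()
all_preds, all_targets = [], []
with torch.no_grad():                        # no gradients needed at eval time
    for data, targets in testloader:
        preds = model(data.cuda())
        all_preds.append(preds.view(-1).cpu())
        all_targets.append(targets.view(-1))

test_auc = roc_auc_score(torch.cat(all_targets).numpy(),
                         torch.cat(all_preds).numpy())
print(f"Test AUC: {test_auc:.4f}")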

Please visit our website or GitHub for more examples.

Citation

If you find LibAUC useful in your work, please cite the following paper:

@article{yuan2020robust,
title={Robust Deep AUC Maximization: A New Surrogate Loss and Empirical Studies on Medical Image Classification},
author={Yuan, Zhuoning and Yan, Yan and Sonka, Milan and Yang, Tianbao},
journal={arXiv preprint arXiv:2012.03173},
year={2020}
}

Contact

If you have any questions, please contact Zhuoning Yuan [[email protected]] and Tianbao Yang [[email protected]], or open a new issue on GitHub.

Comments
  • Only compatible with Nvidia GPU

    I tried running the example tutorial, but I got the following error: AssertionError: Found no NVIDIA driver on your system. Please check that you have an NVIDIA GPU and installed a driver from http://www.nvidia.com/Download/index.aspx

    opened by Beckham45 2
  • Extend to Multi-class Classification Task and Be compatible with PyTorch scheduler

    Hi Zhuoning,

    This is interesting work! I am wondering if the DAM method can be extended to a multi-class classification task with long-tailed imbalanced data. Intuitively, this should be possible, since the well-known sklearn library provides AUC scores for the multi-class setting using one-versus-rest or one-versus-one techniques.

    Besides, it seems that optimizer.update_regularizer() is called only when the learning rate is reduced, so it would be more elegant to incorporate this function call into a PyTorch LR scheduler. E.g.,

    scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(optimizer)
    scheduler.step()    # override step() to also call optimizer.update_regularizer()

    As of the current libauc version, the PESG optimizer is not compatible with the schedulers in torch.optim.lr_scheduler. It would be great if this feature could be supported in the future.

    Thanks for your work!

    opened by Cogito2012 2
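
    A hedged sketch of the wrapper suggested above: subclass ReduceLROnPlateau so that each actual learning rate reduction also triggers optimizer.update_regularizer(). The override relies on the private _reduce_lr method of PyTorch's scheduler, so this is an illustration under that assumption, not supported API:

    import torch

    class PESGReduceLROnPlateau(torch.optim.lr_scheduler.ReduceLROnPlateau):
        # Illustrative wrapper: restart the PESG regularizer on every LR drop.
        def _reduce_lr(self, epoch):
            super()._reduce_lr(epoch)            # standard LR reduction
            self.optimizer.update_regularizer()  # assumed desired restart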
  • When to use retain_graph=True?

    Hi,

    When to use retain_graph=True in the loss backward function?

    In two of the examples (02 and 04) it is set to True, but not in the other examples.

    I appreciate your time.

    opened by dfrahman 1
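
    As general PyTorch background (not LibAUC-specific): retain_graph=True is needed whenever backward() is called more than once through the same computation graph, because by default the graph is freed after the first backward pass. A minimal sketch:

    import torch

    x = torch.ones(3, requires_grad=True)
    y = (x * 2).sum()
    y.backward(retain_graph=True)   # keep the graph alive for a second pass
    y.backward()                    # would raise a RuntimeError without retain_graph=True above
    print(x.grad)                   # gradients accumulate: tensor([4., 4., 4.])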
  • Using AUCMLoss with imratio>1

    I'm not very familiar with the maths in the paper, so please forgive me if I'm asking something obvious.

    The AUCMLoss uses the "imbalance ratio" between positive and negative samples. The ratio is defined as

    the ratio of # of positive examples to the # of negative examples

    Or imratio=#pos/#neg

    When #pos<#neg, imratio is some value between 0 and 1, but when #pos>#neg, imratio>1.

    Will this break the loss calculations? I have a feeling it would invalidate the many 1-self.p calculations in the LibAUC implementation, but as I'm not familiar with the maths I can't say for sure.

    Also, is there a problem (mathematically speaking) with calculating imratio=#pos/#total_samples, to avoid the problem above? When #pos<<#neg, #neg approximates #total_samples.

    opened by ayhyap 1
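
    For reference, both definitions discussed above are one line of NumPy each; a minimal sketch (the train_labels values are illustrative):

    import numpy as np

    train_labels = np.array([0, 0, 0, 1, 1])   # illustrative 0/1 labels
    imratio_total = (train_labels == 1).mean()                              # #pos / #total_samples
    imratio_pos_neg = (train_labels == 1).sum() / (train_labels == 0).sum() # #pos / #neg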
  • AUCMLoss does not use margin argument

    I noticed that the margin argument of the AUCMLoss class is not used. Following the formulation in the paper, line 20 of the forward function should be changed from 2*self.alpha*(self.p*(1-self.p) + \ to 2*self.alpha*(self.p*(1-self.p)*self.margin + \

    opened by ayhyap 1
  • How to train multi-label classification tasks? (like chexpert)

    I have started using this library and I've read your paper Robust Deep AUC Maximization: A New Surrogate Loss and Empirical Studies on Medical Image Classification, and I'm still not sure how to train a multi-label classification (MLC) model.

    Specifically, how did you fine-tune for the Chexpert multi-label classification task? (i.e. classify 5 diseases, where each image may have presence of 0, 1 or more diseases)

    • The first step pre-training with Cross-entropy loss seems clear to me
    • You mention: "In the second step of AUC maximization, we replace the last classifier layer trained in the first step by random weights and use our DAM method to optimize the last classifier layer and all previous layers.". The new classifier layer is a single or multi-label classifier?
    • In the Appendix I, figure 7 shows only one score as output for Deep AUC maximization (i.e. only one disease)
    • In the code, both AUCMLoss() and APLoss_SH() apparently receive single-label outputs, not multi-label outputs

    How do you train for the 5 diseases? Train sequentially Cardiomegaly, then Edema, and so on? or with 5 losses added up? or something else?

    opened by pdpino 4
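
    One plausible reading of the "5 losses added up" option, purely as a hedged sketch rather than the paper's confirmed procedure: keep one binary AUCMLoss per disease column and sum them. The head count and imratio values are placeholders, and the sketch sidesteps how each loss's internal variables are registered with the PESG optimizer, which is version-specific:

    from libauc.losses import AUCMLoss

    num_classes = 5
    losses = [AUCMLoss(imratio=0.1) for _ in range(num_classes)]  # one per disease

    def multilabel_aucm_loss(logits, targets):
        # logits, targets: (batch, num_classes); one binary AUCM loss per column
        return sum(loss_fn(logits[:, k], targets[:, k])
                   for k, loss_fn in enumerate(losses))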
  • Example for tensorflow

    Thank you for the great library. Does it currently support TensorFlow? If so, could you provide an example of how it can be used with TensorFlow? Thank you very much.

    opened by Kokkini 1
Releases (1.1.4)
  • 1.1.4 (Jul 26, 2021)

    What's New

    • Added a PyTorch dataloader for the CheXpert dataset. A tutorial for training on CheXpert is available here.
    • Added support for training AUC losses on CPU machines; note that lines with .cuda() should be removed from the code.
    • Fixed some bugs and improved training stability
  • 1.1.3 (Jun 16, 2021)

  • 1.1.2 (Jun 14, 2021)

    What's New

    1. Added the SOAP optimizer, contributed by @qiqi-helloworld and @yzhuoning, for optimizing AUPRC. Please check the tutorial here.
    2. Updated ResNet18 and ResNet34 with models pretrained on ImageNet1K
    3. Added a new strategy for AUCMLoss: imratio is calculated over a mini-batch if no initial value is given
    4. Fixed some bugs and improved training stability
  • V1.1.0 (May 10, 2021)

    What's New

    • Fixed some bugs and improved training stability
    • Changed the default binary labels expected by the loss function to 0 and 1
    • Added PyTorch dataloaders for CIFAR10, CIFAR100, CAT_vs_Dog, and STL10
    • Enabled training DAM with PyTorch learning rate schedulers, e.g., ReduceLROnPlateau and CosineAnnealingLR