Learning where to learn - Gradient sparsity in meta and continual learning

Last update: Dec 09, 2022

In this paper, we investigate the gradient sparsity found by MAML in various continual and few-shot learning scenarios.
Instead of only learning the initialization of the neural network parameters, we additionally meta-learn a set of parameters that are passed through a step function. Gradient descent on a network parameter is stopped whenever the corresponding meta-learned value is smaller than 0.
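
The update rule described above can be sketched as follows. This is a minimal, framework-free illustration and not the code from this repository: the names `step_mask` and `sparse_inner_step` and all example values are hypothetical, and the sketch leaves out how the underlying values are themselves meta-learned in the outer loop.

```python
def step_mask(m):
    """Step function: 1.0 where the meta-learned value is >= 0, else 0.0."""
    return [1.0 if v >= 0 else 0.0 for v in m]


def sparse_inner_step(theta, grad, m, lr=0.1):
    """One inner-loop gradient step that skips parameters whose mask is off.

    theta: current parameters (hypothetical flat list)
    grad:  gradient of the task loss w.r.t. theta
    m:     meta-learned values underneath the step function
    """
    mask = step_mask(m)
    return [t - lr * k * g for t, g, k in zip(theta, grad, mask)]


# Hypothetical example values, chosen only to show the masking effect.
theta = [1.0, -2.0, 0.5]
grad = [0.5, 0.5, 0.5]
m = [0.3, -0.1, 0.0]

adapted = sparse_inner_step(theta, grad, m, lr=0.1)
print(adapted)  # the second parameter is left unchanged because m[1] < 0
```

With a mask of all ones this reduces to a plain gradient step, which is why the method can only switch learning on or off per parameter rather than rescale it.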

We term this version Sparse-MAML (link to the paper here).

Interestingly, we see that structured sparsity emerges in both the classic 4-layer ConvNet and a ResNet-12 for few-shot learning. This is accompanied by improved robustness and generalisation across many hyperparameters.
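
One way to inspect this kind of sparsity is to count, per layer, the fraction of parameters whose update is switched off. The sketch below assumes the meta-learned values are available as a dictionary mapping a layer name to a flat list of numbers. The helper names, the layer names, and the values are all made up for illustration and are not results from the paper.

```python
def frozen_fraction(mask_values):
    """Fraction of entries whose step function is off (value < 0)."""
    if not mask_values:
        return 0.0
    frozen = sum(1 for v in mask_values if v < 0)
    return frozen / len(mask_values)


def layer_sparsity(masks_by_layer):
    """Map each layer name to its fraction of frozen parameters."""
    return {name: frozen_fraction(vals) for name, vals in masks_by_layer.items()}


# Hypothetical meta-learned values for three layers (illustration only).
example_masks = {
    "conv1.weight": [-0.4, -0.2, -0.1, 0.3],
    "conv4.weight": [0.5, 0.1, -0.3, 0.2],
    "classifier.weight": [0.7, 0.2, 0.9, 0.4],
}

sparsity = layer_sparsity(example_masks)
for name, frac in sparsity.items():
    print(f"{name}: {frac:.0%} of parameters frozen")
```

Sparsity is structured when these fractions differ strongly between layers instead of being spread evenly over the network.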

Note that Sparse-MAML is an extremely simple variant of MAML: it can only switch the training of specific parameters on or off, in contrast to proper gradient modulation.

This codebase implements the few-shot learning experiments presented in the paper. To reproduce the results, please follow these instructions:

Installation

#1. Create a conda env:

conda create -n sparse-MAML

#2. Activate the env:

source activate sparse-MAML

#3. Install anaconda:

conda install anaconda

#4. Install extra requirements (make sure you use the correct pip3):

pip3 install -r requirements.txt

#5. Make the run script executable:

chmod u+x run_sparse_MAML.sh

#6. Execute:

./run_sparse_MAML.sh

Results

MiniImageNet Few-Shot	MAML	ANIL	BOIL	sparse-MAML	sparse-ReLU-MAML
5-way 5-shot | ConvNet	63.15	61.50	66.45	67.03	64.84
5-way 1-shot | ConvNet	48.07	46.70	49.61	50.35	50.39
5-way 5-shot | ResNet12	69.36	70.03	70.50	70.02	73.01
5-way 1-shot | ResNet12	53.91	55.25	-	55.02	56.39

BOIL results are taken from the original paper.
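
To make the comparison easier to read, the numbers in the table can be loaded into a small script that reports the best method per setting and the gain of sparse-MAML over MAML. The values are copied from the table above. The data layout and the helper names `best_method` and `gain_over` are only a suggestion, and the missing BOIL entry is kept as `None`.

```python
# Values copied from the results table; None marks the missing BOIL entry.
results = {
    ("5-way 5-shot", "ConvNet"): {
        "MAML": 63.15, "ANIL": 61.50, "BOIL": 66.45,
        "sparse-MAML": 67.03, "sparse-ReLU-MAML": 64.84,
    },
    ("5-way 1-shot", "ConvNet"): {
        "MAML": 48.07, "ANIL": 46.70, "BOIL": 49.61,
        "sparse-MAML": 50.35, "sparse-ReLU-MAML": 50.39,
    },
    ("5-way 5-shot", "ResNet12"): {
        "MAML": 69.36, "ANIL": 70.03, "BOIL": 70.50,
        "sparse-MAML": 70.02, "sparse-ReLU-MAML": 73.01,
    },
    ("5-way 1-shot", "ResNet12"): {
        "MAML": 53.91, "ANIL": 55.25, "BOIL": None,
        "sparse-MAML": 55.02, "sparse-ReLU-MAML": 56.39,
    },
}


def best_method(row):
    """Name of the method with the highest reported score, skipping gaps."""
    reported = {name: score for name, score in row.items() if score is not None}
    return max(reported, key=reported.get)


def gain_over(row, method, baseline="MAML"):
    """Difference between a method and a baseline, rounded to two decimals."""
    return round(row[method] - row[baseline], 2)


for (setting, backbone), row in results.items():
    print(setting, backbone,
          "best:", best_method(row),
          "sparse-MAML vs MAML:", gain_over(row, "sparse-MAML"))
```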

This codebase is heavily built on top of torchmeta.

Owner

Johannes Oswald
