Source code for Transformer-based Multi-task Learning for Disaster Tweet Categorisation (UCD's participation in TREC-IS 2020A, 2020B and 2021A).

Last update: Oct 19, 2021

Overview

Source code for "UCD participation in TREC-IS 2020A, 2020B and 2021A".

Update (2021/05/25):

This repo so far relates to the following work:

Transformer-based Multi-task Learning for Disaster Tweet Categorisation (WiP paper, ISCRAM 2021)
Multi-task transfer learning for finding actionable information from crisis-related messages on social media (paper, TREC 2020)

Setup

git clone https://github.com/wangcongcong123/crisis-mtl.git
pip install -r requirements.txt

Dataset preparation

Download the splits prepared for the system from here. The archive contains three subdirectories, one each for 2020a, 2020b and 2021a.
Unzip the file to data/.
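
Before training, it can help to confirm that the archive was unzipped into the expected place. The following is a minimal sketch, not part of the repo: it assumes the three subdirectories are named exactly 2020a, 2020b and 2021a and sit directly under data/, and the helper name missing_editions is hypothetical.

```python
from pathlib import Path

# Assumption: subdirectory names match the edition names mentioned above.
EXPECTED_EDITIONS = ("2020a", "2020b", "2021a")


def missing_editions(data_dir="data", editions=EXPECTED_EDITIONS):
    """Return the edition subdirectories that are not present under data_dir."""
    root = Path(data_dir)
    return [name for name in editions if not (root / name).is_dir()]


if __name__ == "__main__":
    missing = missing_editions()
    if missing:
        print("Missing edition directories under data/:", ", ".join(missing))
    else:
        print("All expected edition directories found.")
```

Run it from the repo root after unzipping. If the archive uses different directory names, adjust EXPECTED_EDITIONS accordingly.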

Training and submitting

# for 2020a
python run.py --dataset_name 2020a --model_name bert-base-uncased

# or for 2020b
python run.py --edition 2020b --model_name bert-base-uncased
python run.py --edition 2020b --model_name google/electra-base-discriminator
python run.py --edition 2020b --model_name microsoft/deberta-base
python run.py --edition 2020b --model_name distilbert-base-uncased
python submit_ensemble.py --edition 2020b


# or for 2021a
python run.py --edition 2021a --model_name bert-base-uncased
python run.py --edition 2021a --model_name google/electra-base-discriminator
python run.py --edition 2021a --model_name microsoft/deberta-base
python run.py --edition 2021a --model_name distilbert-base-uncased
python submit_ensemble.py --edition 2021a
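
The 2020b and 2021a sequences above follow the same pattern: train four models, then build the ensemble submission. A small wrapper can run them in order and stop at the first failure. This is only a sketch under stated assumptions: the helper names build_commands and run_pipeline are hypothetical, the script is assumed to be launched from the repo root, and it uses only the --edition and --model_name flags shown above. It does not cover 2020a, whose command above uses --dataset_name instead.

```python
import subprocess
import sys

# Model names copied from the commands listed above.
MODELS = [
    "bert-base-uncased",
    "google/electra-base-discriminator",
    "microsoft/deberta-base",
    "distilbert-base-uncased",
]


def build_commands(edition, models=MODELS):
    """Build one training command per model, followed by the ensemble step."""
    commands = [
        [sys.executable, "run.py", "--edition", edition, "--model_name", model]
        for model in models
    ]
    commands.append([sys.executable, "submit_ensemble.py", "--edition", edition])
    return commands


def run_pipeline(edition):
    """Run each command in order; check=True raises on the first failure."""
    for command in build_commands(edition):
        print("Running:", " ".join(command))
        subprocess.run(command, check=True)
```

Calling run_pipeline("2021a") from the repo root would then be equivalent to typing the five 2021a commands by hand.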

To see how our results compare with other participating runs in 2020a and 2020b, check the appendix of this overview paper. For details of our approach, see this ISCRAM-2021 paper on 2020a and this TREC-2020 paper on 2020b. The evaluation for 2021a is still in progress, so the results will be added as soon as they come out.

Citation

If you use the code in your research, please consider citing the following papers:

@article{wang2021,
author = {Wang, Congcong and Nulty, Paul and Lillis, David},
journal = {Proceedings of the International ISCRAM Conference},
keywords = {18th International Conference on Information Systems for Crisis Response and Management (ISCRAM 2021)},
number = {May},
title = {{Transformer-based Multi-task Learning for Disaster Tweet Categorisation}},
volume = {2021-May},
year = {2021}
}

@inproceedings{congcong2020multi,
 address = {Gaithersburg, MD},
 title = {Multi-task transfer learning for finding actionable information from crisis-related messages on social media},
 booktitle = {Proceedings of the Twenty-Ninth {{Text REtrieval Conference}} ({{TREC}} 2020)},
 author = {Wang, Congcong and Lillis, David},
 year = {2020},
}

Queries

If you have any questions, contact me via [email protected] or create an issue.

Owner

Congcong Wang
