In this workshop we explore state-of-the-art NLP transformers, including SOTA models such as T5 and BERT, and then build a model using the Hugging Face Transformers framework.

Last update: Apr 13, 2022

Overview

Transformers are all you need

Table of Contents

The workshop is divided into four parts:

Introduction to Transformers and the hype around them
Sneak peek at the theory behind Transformers
Quick tour of the Hugging Face framework
Lab
- Fine-tune a translation model

Note that you can always open the notebooks on Google Colab (no need to install anything); you just need a stable internet connection:

- Fine-tune a translation model

2. How to get started

Fork this repository
Create a branch named after you
Go through the notebook and complete all tasks
Submit a pull request

Homework exercise

Your task is to fine-tune a classification model:

Use the Hugging Face transformers and datasets libraries.
Fine-tune the model on one of the classification tasks of the GLUE benchmark (CoLA, to be specific).
Use a checkpoint from the Hub ("distilbert-base-uncased", for example).
Once finished, submit a pull request to this repo, and make sure to place your .ipynb file in the submissions folder (YOUR_NAME.ipynb).
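
The homework steps above can be sketched roughly as follows. This is a minimal starting point, not the workshop's reference solution: the helper names (`matthews_corrcoef`, `compute_metrics`, `fine_tune_cola`), the output directory, and the hyperparameters are assumptions chosen for illustration. CoLA is usually scored with the Matthews correlation coefficient, so the sketch computes it in plain Python and passes it to the `Trainer` as the evaluation metric.

```python
import math


def matthews_corrcoef(preds, labels):
    """Matthews correlation coefficient for binary labels (0 or 1)."""
    tp = tn = fp = fn = 0
    for p, y in zip(preds, labels):
        p, y = int(p), int(y)
        if p == 1 and y == 1:
            tp += 1
        elif p == 0 and y == 0:
            tn += 1
        elif p == 1 and y == 0:
            fp += 1
        else:
            fn += 1
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Convention: return 0.0 when the score is undefined (zero denominator).
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom


def compute_metrics(eval_pred):
    """Turn (logits, labels) into the CoLA metric; works on lists or arrays."""
    logits, labels = eval_pred
    # Predicted class = index of the largest logit in each row.
    preds = [max(range(len(row)), key=lambda i: row[i]) for row in logits]
    return {"matthews_correlation": matthews_corrcoef(preds, labels)}


def fine_tune_cola(checkpoint="distilbert-base-uncased", output_dir="cola-finetuned"):
    """Hypothetical helper: fine-tune `checkpoint` on GLUE CoLA and evaluate it."""
    # Imported lazily so the metric helpers above work without these libraries.
    from datasets import load_dataset
    from transformers import (
        AutoModelForSequenceClassification,
        AutoTokenizer,
        DataCollatorWithPadding,
        Trainer,
        TrainingArguments,
    )

    raw = load_dataset("glue", "cola")  # splits: train / validation / test
    tokenizer = AutoTokenizer.from_pretrained(checkpoint)

    def tokenize(batch):
        # CoLA has a single text column named "sentence".
        return tokenizer(batch["sentence"], truncation=True)

    encoded = raw.map(tokenize, batched=True)
    model = AutoModelForSequenceClassification.from_pretrained(checkpoint, num_labels=2)

    # Assumed, untuned hyperparameters; adjust them for your own runs.
    args = TrainingArguments(
        output_dir=output_dir,
        learning_rate=2e-5,
        per_device_train_batch_size=16,
        num_train_epochs=3,
    )
    trainer = Trainer(
        model=model,
        args=args,
        train_dataset=encoded["train"],
        eval_dataset=encoded["validation"],
        data_collator=DataCollatorWithPadding(tokenizer),  # pads each batch dynamically
        compute_metrics=compute_metrics,
    )
    trainer.train()
    return trainer.evaluate()
```

In a Colab notebook with `transformers` and `datasets` installed, calling `fine_tune_cola()` should download the data and checkpoint, train, and return the evaluation metrics on the validation split. The metric helpers have no third-party dependencies, so they can be checked on small hand-made inputs before starting a training run.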

Useful resources: text_classification

Owner

Aymen Berriche
