Text classification is a popular NLP task in which a program assigns free-text documents to pre-defined classes.

Last update: Mar 17, 2022

Overview

Deep-Learning-for-Text-Document-Classification

Why Text-Document Classification?

The rapid growth of large digital document collections makes text classification increasingly important, especially for companies looking to streamline their workflows or increase profits. Recent NLP research on text classification has produced strong state-of-the-art (SOTA) results, with deep learning methods leading the field. Assessing the performance of SOTA deep learning models for text classification is therefore essential, not only for academic purposes but also for AI practitioners and professionals who need guidance and benchmarks for similar projects.

Owner

Happy N. Monday

GitHub Repository

TEACh is a dataset of human-human interactive dialogues to complete tasks in a simulated household environment.

98 Dec 09, 2022

Google's Meena transformer chatbot implementation

Here's my attempt at recreating Meena, a state of the art chatbot developed by Google Research and described in the paper Towards a Human-like Open-Domain Chatbot.

94 Dec 25, 2022

End-to-end text to speech system using gruut and onnx. There are 40 voices available across 8 languages.

End to end text to speech system using gruut and onnx

673 Dec 28, 2022

American Sign Language (ASL) to Text Converter

Signterpreter American Sign Language (ASL) to Text Converter Recommendations Although there is grayscale and gaussian blur, we recommend that you use

0 Feb 20, 2022

Code for the paper in Findings of EMNLP 2021: "EfficientBERT: Progressively Searching Multilayer Perceptron via Warm-up Knowledge Distillation".

This repository contains the code for the paper in Findings of EMNLP 2021: "EfficientBERT: Progressively Searching Multilayer Perceptron via Warm-up Knowledge Distillation".

28 Nov 10, 2022

ByT5: Towards a token-free future with pre-trained byte-to-byte models

ByT5: Towards a token-free future with pre-trained byte-to-byte models ByT5 is a tokenizer-free extension of the mT5 model. Instead of using a subword

409 Jan 06, 2023

Code for ACL 2021 main conference paper "Conversations are not Flat: Modeling the Intrinsic Information Flow between Dialogue Utterances".

Conversations are not Flat: Modeling the Intrinsic Information Flow between Dialogue Utterances This repository contains the code and pre-trained mode

90 Dec 27, 2022

Continuously update some NLP practice based on different tasks.

NLP_practice We will continuously update some NLP practice based on different tasks. prerequisites Software pytorch = 1.10 torchtext = 0.11.0 sklear

0 Jan 05, 2022

Takes a string and puts it through different languages in Google Translate a requested amount of times, returning nonsense.

PythonTextObfuscator Takes a string and puts it through different languages in Google Translate a requested amount of times, returning nonsense. Requi

2 Aug 29, 2022

FactSumm: Factual Consistency Scorer for Abstractive Summarization

FactSumm: Factual Consistency Scorer for Abstractive Summarization FactSumm is a toolkit that scores Factual Consistency for Abstractive Summarization W

83 Jan 09, 2023

Transformers implementation for Fall 2021 Clinic

Installation Download miniconda3 if not already installed You can check by running typing conda in command prompt. Use conda to create an environment

1 Oct 28, 2021

CoSENT: a sentence-embedding scheme more effective than Sentence-BERT

201 Dec 12, 2022

A versatile token stream for handwritten parsers.

Writing recursive-descent parsers by hand can be quite elegant but it's often a bit more verbose than expected, especially when it comes to handling indentation and reporting proper syntax errors. Th

8 Nov 30, 2022

A simple tool to update bib entries with their official information (e.g., DBLP or the ACL anthology).

Rebiber: A tool for normalizing bibtex with official info. We often cite papers using their arXiv versions without noting that they are already PUBLIS

2k Jan 01, 2023

Creating a chess engine using GPT-3

GPT3Chess Creating a chess engine using GPT-3 Code for my article : https://towardsdatascience.com/gpt-3-play-chess-d123a96096a9 My game (white) vs GP

19 Dec 17, 2022

Python interface for converting Penn Treebank trees to Stanford Dependencies and Universal Dependencies

PyStanfordDependencies Python interface for converting Penn Treebank trees to Universal Dependencies and Stanford Dependencies. Example usage Start by

64 May 08, 2022

PyTorch version of BERT-flow: BERT-flow can be applied to any PLM within the PyTorch framework.

59 Dec 01, 2022

A question-answering app that answers a user-supplied question from user-supplied text.

A question-answering app that answers a user-supplied question from user-supplied text. It is built with HuggingFace's transformer pipeline and the streamlit Python package.

3 Apr 05, 2022

🏖 Easy training and deployment of seq2seq models.

Headliner Headliner is a sequence modeling library that eases the training and in particular, the deployment of custom sequence models for both resear

231 Nov 18, 2022

Implementation of TTS with combination of Tacotron2 and HiFi-GAN

Tacotron2-HiFiGAN-master Implementation of TTS with combination of Tacotron2 and HiFi-GAN for Mandarin TTS. Inference In order to inference, we need t

7 Nov 11, 2022
