In this project, we compare Spanish BERT and Multilingual BERT on the sentiment analysis task.

Last update: Jan 03, 2022

Overview

Applying BERT Fine Tuning to Sentiment Classification on Amazon Reviews

Abstract

Sentiment analysis has made great progress in recent years, driven in part by companies that want to better understand how consumers rate their products. However, despite the advances in artificial intelligence applied to this task, the most robust models are available for English. In this work, we compare a monolingual and a multilingual model, Spanish BERT (BETO) and Multilingual BERT (mBERT), both based on BERT's Transformer architecture. We fine-tuned each model for sentiment analysis on the Spanish Amazon reviews dataset and evaluated them with accuracy and F1 score. Spanish BERT obtained the best results on this dataset.
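
To make the evaluation metrics concrete, here is a minimal sketch of how accuracy and a macro-averaged F1 score can be computed from gold and predicted labels. The function names, the label values, and the choice of macro averaging are assumptions for illustration; the abstract does not state which F1 variant was used.

```python
from collections import Counter
from typing import Hashable, Sequence


def accuracy(y_true: Sequence[Hashable], y_pred: Sequence[Hashable]) -> float:
    """Fraction of predictions that match the gold labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("cannot score an empty label list")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def macro_f1(y_true: Sequence[Hashable], y_pred: Sequence[Hashable]) -> float:
    """Unweighted mean of per-class F1 scores (macro averaging is an assumption)."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("cannot score an empty label list")
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1   # correct prediction for class t
        else:
            fp[p] += 1   # class p was predicted but is wrong
            fn[t] += 1   # class t was missed
    labels = set(y_true) | set(y_pred)
    scores = []
    for label in labels:
        # F1 = 2*TP / (2*TP + FP + FN)
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical labels, not taken from the dataset.
gold = ["pos", "neg", "neg", "pos"]
pred = ["pos", "neg", "pos", "pos"]
print(accuracy(gold, pred))   # 0.75
print(round(macro_f1(gold, pred), 4))
```

Macro averaging weights every class equally, which is one reasonable choice when classes are balanced; a weighted or micro average would give different numbers on skewed data.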

The paper is available here.

Pipeline

Prerequisites

Linux / Windows
Python3

Clone this Repository

git clone https://github.com/alexliqu09/Sentiment-Analysis-on-Amazon-Reviews.git

Train model

To train the models, use the Colab notebooks:

Beto
MBert

Run locally

To try out the project, run the following commands.

First, install the requirements:

pip install -r requeriments.txt

Second, download the weights of BETO and mBERT and put them in this directory.

Third, start the Streamlit server:

streamlit run main.py

Note: once started, Streamlit prints the addresses where the app is served, for example:

Local URL: http://localhost:8501
Network URL: http://192.168.0.5:8501
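
Before starting the server, it can be useful to confirm that the downloaded weights are in place. The sketch below checks a directory for a list of files; the directory name `weights` and the file names `beto.pt` and `mbert.pt` are hypothetical placeholders, since this README does not name them.

```python
from pathlib import Path
from typing import Iterable, List


def missing_weights(weights_dir: Path, expected_files: Iterable[str]) -> List[str]:
    """Return the expected file names that are not present in weights_dir."""
    return [name for name in expected_files if not (weights_dir / name).is_file()]


if __name__ == "__main__":
    # Hypothetical location and file names; adjust them to match the repository.
    missing = missing_weights(Path("weights"), ["beto.pt", "mbert.pt"])
    if missing:
        print("Missing weight files:", ", ".join(missing))
    else:
        print("All weight files found; run: streamlit run main.py")
```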

Run with Docker 🐋

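# Build the Docker image

docker build -t bert .

# Run the container

docker run -t -p 5000:5000 --name betocontainer bert

Then open http://172.17.0.2:8501 in a browser. This is typically the container's address on Docker's default bridge network on Linux, so the IP may differ on your machine. Note that the command above publishes port 5000, while the URL uses Streamlit's default port 8501; if the page is not reachable, publishing 8501 instead (-p 8501:8501) and opening http://localhost:8501 may help.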

If you find our work useful, please cite this paper:

@inproceedings{@lvrBERT,
  title={Applying BERT Fine Tuning to Sentiment Classification on Amazon Reviews},
  author={Lique, Alexander and Vásquez, Diego and Rios, Manuel},
  year={2021}
}

Owner

Alexander Leonardo Lique Lamas
