Findings of ACL 2021

Last update: Feb 24, 2022

Overview

Assessing Dialogue Systems with Distribution Distances

We propose to measure the performance of a dialogue system by computing the distributionwise distance between its generated conversations and real-world conversations.

To appear in Findings of ACL 2021.

Note that this is not an officially supported Tencent product.

1. Configuratin

This repository requires the packages:

pytorch
huggingface/transformers.

2. Usage

To evaluate the system-level human correlations of metrics:

python eval_metric.py \
  --data_path ./datasets/convai2_annotation.json \
  --metric fbd \
  --sample_num 10 \
  --model_type roberta-base \
  --batch_size 32

Currently, our repo supports the common metrics used in text generation field, inclduing bleu, meteor, rouge, greedy, average, extrema, bert_score, fbd and prd.

Here are some details of the six corpura compared in the main paper:

File Name	Dataset Name	Num. of Samples	Reference
`personam_annotation.json`	Persona(M)	60	Shikib/usr
`dailyh_annotation.json`	Daily(H)	150	li3cmz/GRADE
`convai2_annotation.json`	Convai2	150	li3cmz/GRADE
`empathetic_annotation.json`	Empathetic	150	li3cmz/GRADE
`dailyz_annotation.json`	Daily(Z)	100	ZHAOTING/dialog-processing
`personaz_annotation.json`	Persona(Z)	150	ZHAOTING/dialog-processing

Citation

If you use this research/codebase/dataset, please cite our paper:

@article{xiang2021assessing,
  title={Assessing Dialogue Systems with Distribution Distances},
  author={Xiang, Jiannan and Liu, Yahui and Cai, Deng and Li, Huayang and Lian, Defu and Liu, Lemao},
  journal={arXiv preprint arXiv:2105.02573},
  year={2021}
}

Other related papers:

[1] FID, GANs Trained by a Two Time-Scale Update Rule Converge to a Local Nash Equilibrium, NIPS 2017
[2] PRD, Assessing Generative Models via Precision and Recall, NIPS 2018
[3] BERTScore, BERTScore: Evaluating Text Generation with BERT, ICLR 2020

Findings of ACL 2021

Related tags

Overview

Assessing Dialogue Systems with Distribution Distances

1. Configuratin

2. Usage

Citation

Owner

Yahui Liu

pyMorfologik MorfologikpyMorfologik - Python binding for Morfologik.

Open-Source Toolkit for End-to-End Speech Recognition leveraging PyTorch-Lightning and Hydra.

Code for the paper: Sequence-to-Sequence Learning with Latent Neural Grammars

Using context-free grammar formalism to parse English sentences to determine their structure to help computer to better understand the meaning of the sentence.

Fuzzy String Matching in Python

This project is part of Eleuther AI's quest to create a massive repository of high quality text data for training language models.

The Sudachi synonym dictionary in Solar format.

justCTF [*] 2020 challenges sources

All the code I wrote for Overwatch-related projects that I still own the rights to.

Graphical user interface for Argos Translate

QVHighlights: Detecting Moments and Highlights in Videos via Natural Language Queries

This repo contains simple to use, pretrained/training-less models for speaker diarization.

Beyond Accuracy: Behavioral Testing of NLP models with CheckList

Différents programmes créant une interface graphique a l'aide de Tkinter pour simplifier la vie des étudiants.

Toward Model Interpretability in Medical NLP

Residual2Vec: Debiasing graph embedding using random graphs

SentAugment is a data augmentation technique for semi-supervised learning in NLP.

Gathers machine learning and Tensorflow deep learning models for NLP problems, 1.13 < Tensorflow < 2.0

this repository has datasets containing information of Uber pickups in NYC from April 2014 to September 2014 and January to June 2015. data Analysis , virtualization and some insights are gathered here

The ibet-Prime security token management system for ibet network.