Code for EMNLP2020 long paper: BERT-Attack: Adversarial Attack Against BERT Using BERT

Last update: Jan 04, 2023

Related tags

Deep Learning BERT-Attack

Overview

BERT-ATTACK

Code for our EMNLP2020 long paper:

BERT-ATTACK: Adversarial Attack Against BERT Using BERT

Dependencies

Python 3.7
PyTorch 1.4.0
transformers 2.9.0
TextFooler

Usage

To train a classification model, please use the run_glue.py script in the huggingface transformers==2.9.0.

To generate adversarial samples based on the masked-LM, run

python bertattack.py --data_path data_defense/imdb_1k.tsv --mlm_path bert-base-uncased --tgt_path models/imdbclassifier --use_sim_mat 1 --output_dir data_defense/imdb_logs.tsv --num_label 2 --use_bpe 1 --k 48 --start 0 --end 1000 --threshold_pred_score 0

--data_path: We take IMDB dataset as an example. Datasets can be obtained in TextFooler.
--mlm_path: We use BERT-base-uncased model as our target masked-LM.
--tgt_path: We follow the official fine-tuning process in transformers to fine-tune BERT as the target model.
--k 48: The threshold k is the number of possible candidates
--output_dir : The output file.
--start: --end: in case the dataset is large, we provide a script for multi-thread process.
--threshold_pred_score: a score in cutting off predictions that may not be suitable (details in Section5.1)

Note

The datasets are re-formatted to the GLUE style.

Some configs are fixed, you can manually change them.

If you need to use similar-words-filter, you need to download and process consine similarity matrix following TextFooler. We only use the filter in sentiment classification tasks like IMDB and YELP.

If you need to evaluate the USE-results, you need to create the corresponding tensorflow environment USE.

For faster generation, you could turn off the BPE substitution.

As illustrated in the paper, we set thresholds to balance between the attack success rate and USE similarity score.

The multi-thread process use the batchrun.py script

You can run

cat cmd.txt | python batchrun.py --gpus 0,1,2,3

to simutaneously generate adversarial samples of the given dataset for faster generation. We use the IMDB dataset as an example.

Code for EMNLP2020 long paper: BERT-Attack: Adversarial Attack Against BERT Using BERT

Related tags

Overview

BERT-ATTACK

Dependencies

Usage

Note

Owner

Linyang Li

Python scripts form performing stereo depth estimation using the high res stereo model in PyTorch .

Rayvens makes it possible for data scientists to access hundreds of data services within Ray with little effort.

Open CV - Convert a picture to look like a cartoon sketch in python

Kohei's 5th place solution for xview3 challenge

PyTorch implementation of image classification models for CIFAR-10/CIFAR-100/MNIST/FashionMNIST/Kuzushiji-MNIST/ImageNet

Awesome Monocular 3D detection

This repository contains the official code of the paper Equivariant Subgraph Aggregation Networks (ICLR 2022)

Official Keras Implementation for UNet++ in IEEE Transactions on Medical Imaging and DLMIA 2018

4K videos with annotated masks in our ICCV2021 paper 'Internal Video Inpainting by Implicit Long-range Propagation'.

Implementation of SE3-Transformers for Equivariant Self-Attention, in Pytorch.

CvT-ASSD: Convolutional vision-Transformerbased Attentive Single Shot MultiBox Detector (ICTAI 2021 CCF-C 会议)The 33rd IEEE International Conference on Tools with Artificial Intelligence

Code for "Learning Graph Cellular Automata"

PyTorch implementation of an end-to-end Handwritten Text Recognition (HTR) system based on attention encoder-decoder networks

Weakly-supervised object detection.

The 2nd place solution of 2021 google landmark retrieval on kaggle.

[TIP 2021] SADRNet: Self-Aligned Dual Face Regression Networks for Robust 3D Dense Face Alignment and Reconstruction

Drone Task1 - Drone Task1 With Python

Learning to Identify Top Elo Ratings with A Dueling Bandits Approach

Official implementation of DreamerPro: Reconstruction-Free Model-Based Reinforcement Learning with Prototypical Representations in TensorFlow 2

Little Ball of Fur - A graph sampling extension library for NetworKit and NetworkX (CIKM 2020)