RuleBERT: Teaching Soft Rules to Pre-Trained Language Models

Last update: Aug 24, 2022

RuleBERT is a pre-trained language model that has been fine-tuned on soft logical rules. This repo contains the code required to run the experiments of the associated paper.

Installation

0. Clone Repo

git clone https://github.com/MhmdSaiid/RuleBert
cd RuleBert

1. Create virtual env and install reqs

# (optional) create and activate a virtual environment
python -m venv RuleBERT
source RuleBERT/bin/activate
pip install -r requirements.txt

2. Download Data

The datasets can be found here. (Note: they take up about 25 GB on disk.)

You can also run:

bash download_datasets.sh

Run Experiments

When an experiment is complete, the model, the tokenizer, and the results are stored in models/**timestamp**.

i) Single Rules

bash experiments/single_rules/SR.sh data/single_rules

ii) Rule Union Experiment

bash experiments/union_rules/UR.sh data/union_rules

iii) Rule Chain Experiment

bash experiments/chain_rules/CR.sh data/chain_rules

iv) External Datasets

Generate Your Own Data

You can generate your own data for a single rule, a union of rules sharing the same rule head, or a chain of rules.

First, make sure you are in the correct directory.

cd data_generation

1) Single Rule

There are two ways to generate data for a single rule:

i) Pass Data through Arguments

python DataGeneration.py \
       --rule 'spouse(A,B) :- child(A,B).' \
       --pool_list "[['Anne', 'Bob', 'Charlie'], ['Frank', 'Gary', 'Paul']]" \
       --rule_support 0.67

--rule : The rule in string format. Consult here to see how to write a rule.
--pool_list : For every variable in the rule, a list of possible instantiations.
--rule_support : A float representing the rule support. If not specified, the rule defaults to a hard rule.
--max_num_facts : Maximum number of facts in a generated theory.
--num : Total number of theories generated per (rule, facts) pair.
--TWL : When set, three-way logic is used instead of negation as failure, so unsatisfied predicates are no longer considered False.
--complementary_rules : A string of complementary rules to add.
--p_bar : Boolean to show a progress bar. Defaults to True.
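
To make the role of --pool_list more concrete, here is a minimal sketch that parses the pool-list string and enumerates every grounding of the example rule, taking one pool per variable. It only illustrates the idea of instantiating rule variables from pools. The helper names (parse_pool_list, rule_variables, ground_rule) are hypothetical and not part of the repo, and DataGeneration.py may sample, filter, or order instantiations differently.

```python
import ast
import itertools
import re

VARIABLE = re.compile(r"\b[A-Z]\w*\b")  # assumption: variables start with an uppercase letter


def parse_pool_list(text):
    """Parse a pool-list string such as "[['Anne'], ['Bob']]" into nested lists."""
    pools = ast.literal_eval(text)
    if not all(isinstance(pool, list) for pool in pools):
        raise ValueError("pool list must be a list of lists")
    return pools


def rule_variables(rule):
    """Return the distinct variables of a rule, in order of first appearance."""
    seen = []
    for var in VARIABLE.findall(rule):
        if var not in seen:
            seen.append(var)
    return seen


def ground_rule(rule, pools):
    """Yield one grounded copy of the rule per combination of pool values."""
    variables = rule_variables(rule)
    if len(variables) != len(pools):
        raise ValueError("need exactly one pool per variable")
    for values in itertools.product(*pools):
        binding = dict(zip(variables, values))
        yield VARIABLE.sub(lambda m: binding[m.group(0)], rule)


rule = "spouse(A,B) :- child(A,B)."
pools = parse_pool_list("[['Anne', 'Bob', 'Charlie'], ['Frank', 'Gary', 'Paul']]")
groundings = list(ground_rule(rule, pools))
print(len(groundings))  # 3 * 3 = 9 groundings
print(groundings[0])    # spouse(Anne,Frank) :- child(Anne,Frank).
```

With two variables and three names per pool, the sketch produces nine grounded rules, which gives a feel for how quickly the number of possible theories grows with pool size.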

ii) Pass a JSON file

This is more convenient when rules are long or when there are multiple rules. The JSON file specifies the rule(s), pool list(s), and rule support(s), and is passed as an argument.

python DataGeneration.py --rule_json r1.jsonl
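
As a rough sketch of how such a file could be produced, the snippet below writes one JSON object per line (the JSON Lines convention suggested by the .jsonl extension) and reads it back. The key names "rule", "pool_list", and "rule_support" are assumptions that simply mirror the command-line flags, and the helpers write_jsonl and read_jsonl are hypothetical; check a sample file shipped with the repo for the actual schema before relying on this.

```python
import json
import tempfile
from pathlib import Path

# ASSUMPTION: these keys mirror the CLI flags; the real schema may differ.
rules = [
    {
        "rule": "spouse(A,B) :- child(A,B).",
        "pool_list": [["Anne", "Bob", "Charlie"], ["Frank", "Gary", "Paul"]],
        "rule_support": 0.67,
    },
]


def write_jsonl(records, path):
    """Write one JSON object per line."""
    with Path(path).open("w", encoding="utf-8") as handle:
        for record in records:
            handle.write(json.dumps(record) + "\n")


def read_jsonl(path):
    """Read a JSON Lines file back into a list of dictionaries."""
    with Path(path).open(encoding="utf-8") as handle:
        return [json.loads(line) for line in handle if line.strip()]


# Round-trip through a temporary file to check the file is well-formed.
with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "r1_example.jsonl"
    write_jsonl(rules, target)
    round_trip = read_jsonl(target)

print(round_trip == rules)
```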

2) Union of Rules

For a union of rules sharing the same rule-head predicate, we pass a JSON file that contains rules with overlapping rule-head predicates.

python DataGeneration.py --rule_json Multi_rule.json \
                         --type union

--type selects the data generation method. For a union of rules, use --type union. With --type single, single-rule data generation is run for each rule in the file.
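
Before building a union file, it can help to check which rules actually share a rule-head predicate. The following is a minimal sketch, not the repo's own logic: head_predicate and group_by_head are hypothetical helpers, the rules are made-up examples, and the regular expression assumes rules of the form head(...) :- body.

```python
import re
from collections import defaultdict

# Assumes the 'head(args) :- body.' syntax used in the examples above.
RULE_HEAD = re.compile(r"^\s*(?P<head>\w+)\s*\(.*?\)\s*:-")


def head_predicate(rule):
    """Return the predicate name on the left-hand side of ':-'."""
    match = RULE_HEAD.match(rule)
    if match is None:
        raise ValueError(f"not a 'head :- body.' rule: {rule!r}")
    return match.group("head")


def group_by_head(rules):
    """Group rule strings by their head predicate."""
    groups = defaultdict(list)
    for rule in rules:
        groups[head_predicate(rule)].append(rule)
    return dict(groups)


# Hypothetical rules, for illustration only.
rules = [
    "spouse(A,B) :- child(A,B).",
    "spouse(A,B) :- parent(A,B).",
    "relative(A,B) :- sibling(A,B).",
]
groups = group_by_head(rules)

# Only heads with at least two rules are candidates for a union.
union_candidates = {head: rs for head, rs in groups.items() if len(rs) > 1}
print(union_candidates)
```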

3) Chained Rules

For a chain of rules, the JSON file should include rules that can be chained together.

python DataGeneration.py --rule_json chain_rules.json \
                         --type chain

The chain depth defaults to 5 (--chain_depth 5).
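
To see what "rules that can be chained together" might look like, the sketch below links rules whenever the head predicate of one rule appears in the body of another, and enumerates chains up to a maximum depth. This is an illustration under assumptions, not the generator's algorithm: split_rule and chains are hypothetical helpers, the rules are made up, and depth is counted here as the number of rules in a chain, which may differ from how --chain_depth is defined in the repo.

```python
import re


def split_rule(rule):
    """Split 'head(...) :- b1(...), b2(...).' into (head predicate, body predicates)."""
    head, body = rule.rstrip(" .").split(":-")
    head_pred = re.match(r"\s*(\w+)\s*\(", head).group(1)
    body_preds = re.findall(r"(\w+)\s*\(", body)
    return head_pred, body_preds


def chains(rules, max_depth=5):
    """Enumerate rule sequences where each rule's head feeds the next rule's body."""
    parsed = [split_rule(rule) for rule in rules]
    found = []

    def extend(path):
        if len(path) > 1:
            found.append([rules[i] for i in path])
        if len(path) == max_depth:
            return
        head_pred = parsed[path[-1]][0]
        for nxt, (_, body_preds) in enumerate(parsed):
            # Skip rules already on the path so cyclic rule sets terminate.
            if nxt not in path and head_pred in body_preds:
                extend(path + [nxt])

    for start in range(len(rules)):
        extend([start])
    return found


# Hypothetical rules, for illustration only.
rules = [
    "parent(A,B) :- child(B,A).",
    "relative(A,B) :- parent(A,B).",
    "knows(A,B) :- relative(A,B).",
]
all_chains = chains(rules)
for chain in all_chains:
    print(" -> ".join(chain))
```

Lowering max_depth prunes the longer sequences; with max_depth=2 only the two-rule chains remain.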

Train your Own Model

To fine-tune the model, run:

# train
python trainer.py --data-dir data/R1/ \
                  --epochs 3 \
                  --verbose

When complete, the model and tokenizer are saved in models/**timestamp**.

To test the model, run:

# test
python tester.py --test_data_dir data/test_R1/ \
                 --model_dir models/**timestamp** \
                 --verbose

A JSON file will be saved in model_dir containing the results.
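
Since the name and layout of the results file are not specified here, one cautious way to inspect a run is to load every JSON file found in the model directory. The sketch below does that; load_json_files is a hypothetical helper, and the file name and metric in the demo are placeholders, not real output. The directory may also hold JSON files written when the model and tokenizer are saved (an assumption), so expect to filter the result.

```python
import json
import tempfile
from pathlib import Path


def load_json_files(model_dir):
    """Load every *.json file in model_dir, keyed by file name."""
    loaded = {}
    for path in sorted(Path(model_dir).glob("*.json")):
        with path.open(encoding="utf-8") as handle:
            loaded[path.name] = json.load(handle)
    return loaded


# Demo on a temporary directory standing in for models/<timestamp>.
# "results.json" and "example_metric" are placeholders for illustration.
with tempfile.TemporaryDirectory() as model_dir:
    placeholder = {"example_metric": 0.0}
    (Path(model_dir) / "results.json").write_text(
        json.dumps(placeholder), encoding="utf-8"
    )
    loaded = load_json_files(model_dir)

print(loaded)
```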

Contact Us

For any inquiries, feel free to contact us, or raise an issue on GitHub.

Reference

You can cite our work:

@inproceedings{saeed-etal-2021-rulebert,
    title = "{R}ule{BERT}: Teaching Soft Rules to Pre-Trained Language Models",
    author = "Saeed, Mohammed  and
      Ahmadi, Naser  and
      Nakov, Preslav  and
      Papotti, Paolo",
    booktitle = "Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing",
    month = nov,
    year = "2021",
    address = "Online and Punta Cana, Dominican Republic",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2021.emnlp-main.110",
    pages = "1460--1476",
    abstract = "While pre-trained language models (PLMs) are the go-to solution to tackle many natural language processing problems, they are still very limited in their ability to capture and to use common-sense knowledge. In fact, even if information is available in the form of approximate (soft) logical rules, it is not clear how to transfer it to a PLM in order to improve its performance for deductive reasoning tasks. Here, we aim to bridge this gap by teaching PLMs how to reason with soft Horn rules. We introduce a classification task where, given facts and soft rules, the PLM should return a prediction with a probability for a given hypothesis. We release the first dataset for this task, and we propose a revised loss function that enables the PLM to learn how to predict precise probabilities for the task. Our evaluation results show that the resulting fine-tuned models achieve very high performance, even on logical rules that were unseen at training. Moreover, we demonstrate that logical notions expressed by the rules are transferred to the fine-tuned model, yielding state-of-the-art results on external datasets.",
}

License

MIT
