The InterScript dataset contains interactive user feedback on scripts generated by a T5-XXL model.

Last update: Dec 01, 2022

Overview

Interscript

The Interscript dataset contains interactive user feedback on scripts generated by a T5-11B model.

Dataset

data.json contains the data in an easy-to-read JSON format. data.jsonl contains the same data in JSONL format: 8466 samples, one sample per line. Every sample is a JSON object with the following fields:

    {
        "input_script": "push chair in -> pull chair in; pull chair in -> push chair against wall; push chair against wall -> straighten chair legs; straighten chair legs -> Push all chairs in; line up the chairs -> push chair in",
        "input_feedback": "One would not pull chair in if they had initially pushed it in.",
        "output_script": "push chair against wall -> straighten chair legs;straighten chair legs -> Push all chairs in;line up the chairs -> push chair in;push chair in -> push chair against wall",
        "metadata": {
            "id": "301KG0KX9BKTC0HB7Z9SV1Y5HAFH2Y.2_implicit.gp",
            "goal": "push all chairs in",
            "is_distractor": false,
            "feedback_type": "implicit.gp",
            "edit": "Remove node 'pull chair in'",
            "input_script_formatted": [
                "1. line up the chairs",
                "2. push chair in",
                "3. pull chair in",
                "4. push chair against wall",
                "5. straighten chair legs",
                "6. Push all chairs in"
            ],
            "output_script_formatted": [
                "1. line up the chairs",
                "2. push chair in",
                "3. push chair against wall",
                "4. straighten chair legs",
                "5. Push all chairs in"
            ]
        }
    }
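
Reading the JSONL file can be sketched as below. This is a minimal example, not part of the dataset's own tooling: the helper name `load_samples` is hypothetical, and the path is assumed to point at a local copy of data.jsonl.

```python
import json
from pathlib import Path


def load_samples(path):
    """Read one JSON object per non-empty line from a JSONL file."""
    samples = []
    with Path(path).open(encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:  # tolerate blank lines, e.g. a trailing newline
                samples.append(json.loads(line))
    return samples
```

Calling `load_samples("data.jsonl")` should return a list of dictionaries, each with the keys `input_script`, `input_feedback`, `output_script`, and `metadata` shown in the sample above.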

The description of the fields is as follows:

input_script: Model-generated script $y_{bad}$.
input_feedback: User feedback on the input script $f$.
output_script: Fixed output script $y_{good}$.
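
Judging from the sample above, a script string is a list of `step -> next step` edges separated by semicolons (with or without a following space). The sketch below parses that format and orders the steps with a topological sort. The function names `parse_script` and `linearize` are hypothetical, and the edge format is inferred from the one sample shown here rather than from a formal specification.

```python
from collections import defaultdict, deque


def parse_script(script):
    """Split 'a -> b; c -> d' into a list of (source, target) pairs."""
    edges = []
    for part in script.split(";"):
        part = part.strip()
        if not part:
            continue
        # Assumes every part contains exactly one '->' separator.
        source, target = part.split("->", 1)
        edges.append((source.strip(), target.strip()))
    return edges


def linearize(edges):
    """Return the steps in one topological order (Kahn's algorithm)."""
    successors = defaultdict(list)
    indegree = {}
    for source, target in edges:
        successors[source].append(target)
        indegree.setdefault(source, 0)
        indegree[target] = indegree.get(target, 0) + 1

    # Start from the steps that have no predecessor.
    ready = deque(step for step, degree in indegree.items() if degree == 0)
    order = []
    while ready:
        step = ready.popleft()
        order.append(step)
        for following in successors[step]:
            indegree[following] -= 1
            if indegree[following] == 0:
                ready.append(following)

    if len(order) != len(indegree):
        raise ValueError("script contains a cycle")
    return order
```

Applied to the `input_script` of the sample above, `linearize(parse_script(...))` gives the steps in the same order as `input_script_formatted`, without the numbering. Scripts with branching steps may admit several valid orders, so a match with the formatted list is not guaranteed in general.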

Metadata contains additional information about the sample. Some important fields are:

id: Unique identifier of the sample.
goal: Goal of the script.
is_distractor: Whether the feedback is a distractor (please see Section 4 for more details).
feedback_type: Type of feedback (please see Section 4 "Annotation" for more details).
edit: The input_feedback presented as an edit operation on the input script, that is, the edit operation that transforms the input script into the output script.
input_script_formatted: The input script presented as a list of sentences.
output_script_formatted: The output script presented as a list of sentences.
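
The metadata fields make it straightforward to slice the data, for example by feedback type or by whether the feedback is a distractor. The following is a minimal sketch: the function name `summarize` is hypothetical, and it assumes `samples` is a list of dictionaries shaped like the sample above.

```python
from collections import Counter


def summarize(samples):
    """Count samples per feedback_type and keep the non-distractor samples."""
    by_type = Counter(s["metadata"]["feedback_type"] for s in samples)
    non_distractors = [s for s in samples if not s["metadata"]["is_distractor"]]
    return by_type, non_distractors
```

`by_type` maps each `feedback_type` value to its number of samples, and `non_distractors` holds only the samples whose `is_distractor` flag is false.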

Data collection process

We use Amazon Mechanical Turk to collect user feedback on erroneous scripts.
An overview of the process is captured in the following figure:

Amazon Mechanical Turk Template

turk_template.html contains the template for Amazon Mechanical Turk HITs.

Owner

AI2
