Safe Policy Optimization with Local Features

Overview

Safe Policy Optimization with Local Feature (SPO-LF)

This is the source-code for implementing the algorithms in the paper "Safe Policy Optimization with Local Generalized Linear Function Approximations" which was presented in NeurIPS-21.

Installation

There is requirements.txt in this repository. Except for the common modules (e.g., numpy, scipy), our source code depends on the following modules.

We also provide Dockerfile in this repository, which can be used for reproducing our grid-world experiment.

Simulation configuration

We manage the simulation configuration using hydra. Configurations are listed in config.yaml. For example, the algorithm to run should be chosen from the ones we implemented:

sim_type: {safe_glm, unsafe_glm, random, oracle, safe_gp_state, safe_gp_feature, safe_glm_stepwise}

Grid World Experiment

The source code necessary for our grid-world experiment is contained in /grid_world folder. To run the simulation, for example, use the following commands.

cd grid_world
python main.py sim_type=safe_glm env.reuse_env=False

For the monte carlo simulation while comparing our proposed method with baselines, use the shell file, run.sh.

We also provide a script for visualization. If you want to render how the agent behaves, use the following command.

python main.py sim_type=safe_glm env.reuse_env=True

Safety-Gym Experiment

The source code necessary for our safety-gym experiment is contained in /safety_gym_discrete folder. Our experiment is based on safety-gym. Our proposed method utilize dynamic programming algorithms to solve Bellman Equation, so we modified engine.py to discrtize the environment. We attach modified safety-gym source code in /safety_gym_discrete/engine.py. To use the modified library, please clone safety-gym, then replace safety-gym/safety_gym/envs/engine.py using /safety_gym_discrete/engine.py in our repo. Using the following commands to install the modified library:

cd safety_gym
pip install -e .

Note that MuJoCo licence is needed for installing Safety-Gym. To run the simulation, use the folowing commands.

cd safety_gym_discrete
python main.py sim_idx=0

We compare our proposed method with three notable baselines: CPO, PPO-Lagrangian, and TRPO-Lagrangian. The baseline implementation depends on safety-starter-agents. We modified run_agent.py in the repo source code.

To run the baseline, use the folowing commands.

cd safety_gym_discrete/baseline
python baseline_run.py sim_type=cpo

The environment that agent runs on is generated using generate_env.py. We provide 10 50*50 environments. If you want to generate other environments, you can change the world shape in safety_gym_discrete.py, and running the following commands:

cd safety_gym_discrete
python generate_env.py

Citation

If you find this code useful in your research, please consider citing:

@inproceedings{wachi_yue_sui_neurips2021,
  Author = {Wachi, Akifumi and Wei, Yunyue and Sui, Yanan},
  Title = {Safe Policy Optimization with Local Generalized Linear Function Approximations},
  Booktitle  = {Neural Information Processing Systems (NeurIPS)},
  Year = {2021}
}
Owner
Akifumi Wachi
Akifumi Wachi
A python module for retrieving and parsing WHOIS data

pythonwhois A WHOIS retrieval and parsing library for Python. Dependencies None! All you need is the Python standard library. Instructions The manual

Sven Slootweg 384 Dec 23, 2022
This repo created for bypassing Widevine L3 DRM and obtaining keys.

First run: Copy headers (with cookies) of POST license request from browser to headers.py like dictionary. pip install -r requirements.txt # if doesn'

Mikhail 263 Jan 07, 2023
Trainspotting - Python Dependency Injector based on interface binding

Choose dependency injection Friendly with MyPy Supports lazy injections Supports

avito.tech 3 Jan 26, 2022
TightVNC Vulnerability.

CVE-2022-23967 In TightVNC 1.3.10, there is an integer signedness error and resultant heap-based buffer overflow in InitialiseRFBConnection in rfbprot

MaherAzzouzi 15 Jul 11, 2022
A Static Analysis Tool for Detecting Security Vulnerabilities in Python Web Applications

This project is no longer maintained March 2020 Update: Please go see the amazing Pysa tutorial that should get you up to speed finding security vulne

2.1k Dec 25, 2022
An advanced multi-threaded, multi-client python reverse shell for hacking linux systems

PwnLnX An advanced multi-threaded, multi-client python reverse shell for hacking linux systems. There's still more work to do so feel free to help out

0xTRAW 212 Dec 24, 2022
Log4jScanner is a Log4j Related CVEs Scanner, Designed to Help Penetration Testers to Perform Black Box Testing on given subdomains.

Log4jScanner Log4jScanner is a Log4j Related CVEs Scanner, Designed to Help Penetration Testers to Perform Black Box Testing on given subdomains. Disc

Pushpender Singh 35 Dec 12, 2022
Script to calculate Active Directory Kerberos keys (AES256 and AES128) for an account, using its plaintext password

Script to calculate Active Directory Kerberos keys (AES256 and AES128) for an account, using its plaintext password

Matt Creel 27 Dec 20, 2022
Linus-png.github.io - Versionsverwaltung & Open Source Hausaufgabe

Let's Git - Versionsverwaltung & Open Source Hausaufgabe Herzlich Willkommen zu

1 Jan 24, 2022
A repository to detect the ARP spoofing in any devices and prevent Man in the Middle(MITM) attack using Python3

arp_spoof_detector A repository to detect the ARP spoofing in any devices and prevent Man in the Middle(MITM) attack using Python3 Usage: git clone ht

Surya Das N 1 Oct 30, 2021
A tool to extract the IdP cert from vCenter backups and log in as Administrator

vCenter SAML Login Tool A tool to extract the Identity Provider (IdP) cert from vCenter backups and log in as Administrator Background Commonly, durin

Horizon 3 AI Inc 343 Dec 31, 2022
Android Malware (Analysis | Scoring) System

An Obfuscation-Neglect Android Malware Scoring System Quark-Engine is also bundled with Kali Linux, BlackArch. A trust-worthy, practical tool that's r

Quark-Engine 1k Jan 04, 2023
DNSSEQ: PowerDNS with FALCON Signature Scheme

PowerDNS-based proof-of-concept implementation of DNSSEC using the post-quantum FALCON signature scheme.

Nils Wisiol 4 Feb 03, 2022
Python exploit code for CVE-2021-4034 (pwnkit)

Python3 code to exploit CVE-2021-4034 (PWNKIT). This was an exercise in "can I make this work in Python?", and not meant as a robust exploit. It Works

Joe Ammond 92 Dec 29, 2022
A Python script that can be used to check if a SAP system is affected by CVE-2022-22536

Vulnerability assessment for CVE-2022-22536 This repository contains a Python script that can be used to check if a SAP system is affected by CVE-2022

Onapsis Inc. 42 Dec 01, 2022
DCSync - DCSync Attack from Outside using Impacket

Adding DCSync Permissions Mostly copypasta from https://github.com/tothi/rbcd-at

n00py 77 Dec 16, 2022
Fast subdomain scanner, Takes arguments from a Json file ("args.json") and outputs the subdomains.

Fast subdomain scanner, Takes arguments from a Json file ("args.json") and outputs the subdomains. File Structure core/ colors.py db/ wordlist.txt REA

whoami security 4 Jul 02, 2022
Kriecher is a simple Web Scanner which will run it's own checks for the OWASP

Kriecher is a simple Web Scanner which will run it's own checks for the OWASP top 10 https://owasp.org/www-project-top-ten/# as well as run a

1 Nov 12, 2021
Log4Shell RCE Exploit - fully independent exploit does not require any 3rd party binaries.

Log4Shell RCE Exploit fully independent exploit does not require any 3rd party binaries. The exploit spraying the payload to all possible logged HTTP

258 Jan 02, 2023
RedTeam-Security - In this repo you will get the information of Red Team Security related links

OSINT Passive Discovery Amass - https://github.com/OWASP/Amass (Attack Surface M

Abhinav Pathak 5 May 18, 2022