This project uses reinforcement learning on stock market and agent tries to learn trading. The goal is to check if the agent can learn to read tape. The project is dedicated to hero in life great Jesse Livermore.

Overview

Reinforcement-trading

This project uses Reinforcement learning on stock market and agent tries to learn trading. The goal is to check if the agent can learn to read tape. The project is dedicated to hero in life great Jesse Livermore and one of the best human i know Ryan Booth https://github.com/ryanabooth.

One Point to note, the code inside tensor-reinforcement is the latest code and you should be reading/running if you are interested in project. Leave other directories, I am not working on them for now
. To read my thought journal during ongoing development https://github.com/deependersingla/deep_trader/blob/master/deep_thoughts.md

Before this I have used RL here: http://somedeepthoughtsblog.tumblr.com/post/134793589864/maths-versus-computation

Now I run a company on RL trading, so I can't answer questions related to the project.

Steps to reproduce DQN

a) cd tensor-reinforcement
b) Copy data from https://drive.google.com/file/d/0B6ZrYxEMNGR-MEd5Ti0tTEJjMTQ/view and https://drive.google.com/file/d/0B6ZrYxEMNGR-Q0YwWWVpVnJ3YmM/view?usp=sharing into tensor-reinforcement directory.
b) Create a directory saved_networks inside tensor_reinforcement for saving networks.
c) python dqn_model.py

Steps to reproduce PG

a) cd tensor-reinforcement
b) Create a directory saved_networks inside tensor_reinforcement for saving networks.
c) python pg_model.py

For the first iteration of the project

Process:
Intially I started by using Chainer for the project for both supervised and reinforcement learning. In middle of it AlphaGo (https://research.googleblog.com/2016/01/alphago-mastering-ancient-game-of-go.html) came because of it I shifted to read Sutton book on RL (https://webdocs.cs.ualberta.ca/~sutton/book/the-book.html), AlphaGo and related papers, David Silver lectures (http://www0.cs.ucl.ac.uk/staff/d.silver/web/Teaching.html, they are great).

I am coming back to project after some time a lot has changed. All the cool kids even DeepMind (the gods) have started using TensorFlow. Hence, I am ditching Chainer and will use Tensorflow from now. Exciting times ahead.

Policy network

I will be starting with simple feed-forward network. Though, I am also inclined to use convolutional network reason, they do very well when the minor change in input should not change ouput. For example: In image recognizition, a small pixel values change doesn't meam image is changed. Intutively stocks numbers look same to me, a small change should not trigger a trade but again the problem here comes with normalization. With normalization the big change in number will be reduced to a very small in inputs hence its good to start with feed-forward.

Feed-forward

I want to start with 2 layer first, yes that just vanilla but lets see how it works than will shift to more deeper network. On output side I will be using a sigmoid non-linear function to get value out of 0 and 1. In hidden layer all neurons will be RELU. With 2 layers, I am assuming that first layer w1 can decide whether market is bullish, bearish and stable. 2nd layer can then decide what action to take based on based layer.

Training

I will run x episode of training and each will have y time interval on it. Policy network will have to make x*y times decision of whether to hold, buy or short. After this based on our reward I will label every decison whether it was good/bad and update network. I will again run x episode on the improved network and will keep doing it. Like MCTS where things average out to optimality our policy also will start making more positive decision and less negative decision even though in training we will see policy making some wrong choices but on average it will work out because we will do same thing million times.

Episodic

I plan to start with episodic training rather than continous training. The major reason for this is that I will not have to calculate reward after every action which agent will make which is complex to do in trading, I can just make terminal reward based on portfolio value after an entire episode (final value of portfolio - transaction cost occur inside the episode - initial value of portfolio). The other reason for doing it that I believe it will motivate agent to learn trading on episodes, which decreases risk of any outlier events or sentiment change in market.

This also means that I have to check the hypothesis on:
a) Episodes of different length
b) On different rewards terminal reward or rewards after each step inside an episode also.
As usual like every AI projects, there will be a lot of hit and trial. I should better write good code and store all results properly so that I can compare them to see what works and what don't. Ofcourse the idea is to make sure agent remain profitable while trading.

More info here https://docs.google.com/document/d/12TmodyT4vZBViEbWXkUIgRW_qmL1rTW00GxSMqYGNHU/edit

Data sources

  1. For directly running this repo, use this data source and you are all setup: https://drive.google.com/open?id=0B6ZrYxEMNGR-MEd5Ti0tTEJjMTQ
  2. Nifty Data: https://drive.google.com/folderview?id=0B8e3dtbFwQWUZ1I5dklCMmE5M2M&ddrp=1%20%E2%81%A0%E2%81%A0%E2%81%A0%E2%81%A09:05%20PM%E2%81%A0%E2%81%A0%E2%81%A0%E2%81%A0%E2%81%A0
  3. Nifty futures:http://www.4shared.com/folder/Fv9Jm0bS/NSE_Futures
  4. Google finance
  5. Interative Brokers, I used IB because I have an account with them.

For reading on getting data using IB https://www.interactivebrokers.com/en/software/api/apiguide/tables/historical_data_limitations.htm https://www.interactivebrokers.com/en/software/api/apiguide/java/historicaldata.htm symbol: stock -> STK, Indices -> IND

Reinforcement learning resources

https://github.com/aikorea/awesome-rl , this is enough if you are serious

Owner
Deepender Singla
Works at @niveshi. Before @accredible. Simple and nice guy.
Deepender Singla
Simple sinc interpolation in PyTorch.

Kazane: simple sinc interpolation for 1D signal in PyTorch Kazane utilize FFT based convolution to provide fast sinc interpolation for 1D signal when

Chin-Yun Yu 10 May 03, 2022
The Codebase for Causal Distillation for Language Models.

Causal Distillation for Language Models Zhengxuan Wu*,Atticus Geiger*, Josh Rozner, Elisa Kreiss, Hanson Lu, Thomas Icard, Christopher Potts, Noah D.

Zen 20 Dec 31, 2022
Code for NAACL 2021 full paper "Efficient Attentions for Long Document Summarization"

LongDocSum Code for NAACL 2021 paper "Efficient Attentions for Long Document Summarization" This repository contains data and models needed to reprodu

56 Jan 02, 2023
This package contains a PyTorch Implementation of IB-GAN of the submitted paper in AAAI 2021

The PyTorch implementation of IB-GAN model of AAAI 2021 This package contains a PyTorch implementation of IB-GAN presented in the submitted paper (IB-

Insu Jeon 9 Mar 30, 2022
Trans-Encoder: Unsupervised sentence-pair modelling through self- and mutual-distillations

Trans-Encoder: Unsupervised sentence-pair modelling through self- and mutual-distillations Code repo for paper Trans-Encoder: Unsupervised sentence-pa

Amazon 101 Dec 29, 2022
Deep Learning for Human Part Discovery in Images - Chainer implementation

Deep Learning for Human Part Discovery in Images - Chainer implementation NOTE: This is not official implementation. Original paper is Deep Learning f

Shintaro Shiba 63 Sep 25, 2022
Auto HMM: Automatic Discrete and Continous HMM including Model selection

Auto HMM: Automatic Discrete and Continous HMM including Model selection

Chess_champion 29 Dec 07, 2022
PyTorch implementation of paper "IBRNet: Learning Multi-View Image-Based Rendering", CVPR 2021.

IBRNet: Learning Multi-View Image-Based Rendering PyTorch implementation of paper "IBRNet: Learning Multi-View Image-Based Rendering", CVPR 2021. IBRN

Google Interns 371 Jan 03, 2023
YOLOX_AUDIO is an audio event detection model based on YOLOX

YOLOX_AUDIO is an audio event detection model based on YOLOX, an anchor-free version of YOLO. This repo is an implementated by PyTorch. Main goal of YOLOX_AUDIO is to detect and classify pre-defined

intflow Inc. 77 Dec 19, 2022
A small library for doing fluid simulation with neural networks.

Neural Fluid Fields This is a small library for doing fluid simulation with neural fields. Check out our review paper, Neural Fields in Visual Computi

Towaki 23 Jun 23, 2022
Hypernetwork-Ensemble Learning of Segmentation Probability for Medical Image Segmentation with Ambiguous Labels

Hypernet-Ensemble Learning of Segmentation Probability for Medical Image Segmentation with Ambiguous Labels The implementation of Hypernet-Ensemble Le

Sungmin Hong 6 Jul 18, 2022
Tutorials and implementations for "Self-normalizing networks"

Self-Normalizing Networks Tutorials and implementations for "Self-normalizing networks"(SNNs) as suggested by Klambauer et al. (arXiv pre-print). Vers

Institute of Bioinformatics, Johannes Kepler University Linz 1.6k Jan 07, 2023
Bayes-Newton—A Gaussian process library in JAX, with a unifying view of approximate Bayesian inference as variants of Newton's algorithm.

Bayes-Newton Bayes-Newton is a library for approximate inference in Gaussian processes (GPs) in JAX (with objax), built and actively maintained by Wil

AaltoML 165 Nov 27, 2022
Bayesian Inference Tools in Python

BayesPy Bayesian Inference Tools in Python Our goal is, given the discrete outcomes of events, estimate the distribution of categories. Using gradient

Max Sklar 99 Dec 14, 2022
MM1 and MMC Queue Simulation using python - Results and parameters in excel and csv files

implementation of MM1 and MMC Queue on randomly generated data and evaluate simulation results then compare with analytical results and draw a plot curve for them, simulate some integrals and compare

Mohamadreza Rezaei 1 Jan 19, 2022
SAS output to EXCEL converter for Cornell/MIT Language and acquisition lab

CORNELLSASLAB SAS output to EXCEL converter for Cornell/MIT Language and acquisition lab Instructions: This python code can be used to convert SAS out

2 Jan 26, 2022
Regularizing Nighttime Weirdness: Efficient Self-supervised Monocular Depth Estimation in the Dark (ICCV 2021)

Regularizing Nighttime Weirdness: Efficient Self-supervised Monocular Depth Estimation in the Dark (ICCV 2021) Kun Wang, Zhenyu Zhang, Zhiqiang Yan, X

kunwang 66 Nov 24, 2022
Complex Answer Generation For Conversational Search Systems.

Complex Answer Generation For Conversational Search Systems. Code for Does Structure Matter? Leveraging Data-to-Text Generation for Answering Complex

Hanane Djeddal 0 Dec 06, 2021
NLP From Scratch Without Large-Scale Pretraining: A Simple and Efficient Framework

NLP From Scratch Without Large-Scale Pretraining This repository contains the code, pre-trained model checkpoints and curated datasets for our paper:

Xingcheng Yao 224 Dec 08, 2022
Convnext-tf - Unofficial tensorflow keras implementation of ConvNeXt

ConvNeXt Tensorflow This is unofficial tensorflow keras implementation of ConvNe

29 Oct 06, 2022