The Impact of Reinforcement Learning on Financial Trading

Oboulo Int.

Case study Format .docx

Themes

Reinforcement learning, financial trading, artificial intelligence, stock exchange, algorithm, PPO (Proximal Policy Optimization), A2C (Advantage Actor Critic), DDPG (Deep Deterministic Policy Gradient), Nissan, S&P 500

Abstract

Reinforcement learning has established itself in recent years as an essential topic in artificial intelligence research. As with other machine learning methods, the underlying techniques are not new (the Q-learning algorithm was introduced in 1989), but they came to worldwide attention through a few emblematic advances. In particular, a single Q-learning program combined with deep learning (the deep Q-network, first reported in 2013 and published in 2015) allowed DeepMind researchers to match or exceed human-level performance on many Atari games, before their AlphaGo system beat Go legend Lee Sedol in 2016.

Contents

Short history
Reinforcement Learning Process
Contextual Example of Reinforcement Learning
Reinforcement Learning in Trading
Example in Real-life Trading
1. Japan (Nissan)
2. USA (S&P 500)

Extract

[...] Reinforcement learning process: this technique is based on the assumption that, within a system, it is possible to set up a logic mechanism A capable of choosing outputs as a function of the inputs received. [...]

[...] The second is the state of the "environment" for stock trading. Intuitively, this is the stock price, which changes over time, but the price alone is not comprehensive enough because there are many stocks in the market. Sound decision making usually requires complete rather than one-sided information, but for the sake of simplicity, in our example we will use indicators other than the stock price, such as MACD, RSI, CCI and ADX. Then there is the profit, which matters most to the trader and is also what the algorithm aims to maximize. [...]
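As a rough illustration of such a state, the sketch below computes two of the indicators named above (MACD and RSI) and packs them into a state vector. The periods (12, 26 and 14) are the conventional defaults rather than values given in the case study, the RSI uses simple averages over the window, and `build_state` with its `cash` and `shares` fields is a hypothetical layout chosen for this example.

```python
from typing import List


def ema(values: List[float], span: int) -> List[float]:
    """Exponential moving average with smoothing factor 2 / (span + 1)."""
    alpha = 2.0 / (span + 1)
    out = [values[0]]
    for v in values[1:]:
        out.append(alpha * v + (1 - alpha) * out[-1])
    return out


def macd(prices: List[float], fast: int = 12, slow: int = 26) -> List[float]:
    """MACD line: fast EMA minus slow EMA (conventional 12/26 periods)."""
    return [f - s for f, s in zip(ema(prices, fast), ema(prices, slow))]


def rsi(prices: List[float], period: int = 14) -> float:
    """RSI over the last `period` price changes, using simple averages."""
    window = prices[-(period + 1):]
    changes = [b - a for a, b in zip(window, window[1:])]
    gains = sum(c for c in changes if c > 0)
    losses = sum(-c for c in changes if c < 0)
    if losses == 0:
        # No down moves in the window: 100 if prices rose, 50 if flat.
        return 100.0 if gains > 0 else 50.0
    rs = gains / losses
    return 100.0 - 100.0 / (1.0 + rs)


def build_state(prices: List[float], cash: float, shares: int) -> List[float]:
    """Hypothetical state layout: [cash, shares, last price, MACD, RSI]."""
    return [cash, float(shares), prices[-1], macd(prices)[-1], rsi(prices)]
```

For a steadily rising price series, the MACD entry is positive and the RSI entry is 100, which matches the usual reading of both indicators as momentum signals.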

[...] Take the example of how we play Super Mario. We first break the action down into steps:
1. Our eyes observe the interface of the game, including the location of Mario, the location of obstacles, etc.
2. This image information is transmitted to our brain, and we then make a decision: we select a move from the possible operations of down, left and right.
3. We then carry out this operation by hand; the game interface refreshes, and we see Mario moving forward, getting a gold coin, or accidentally falling into the pit.
4. [...]
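The observe, decide and act steps above can be sketched as a small interaction loop. The `ToyGame` class below is a hypothetical stand-in for the game (a one-dimensional level with a coin and a pit, and only left and right moves); it is an assumption made for illustration and not something described in the case study.

```python
import random


class ToyGame:
    """Hypothetical 1-D level: start at 0, coin at +3, pit at -2."""

    def __init__(self) -> None:
        self.position = 0
        self.done = False

    def observe(self) -> int:
        """What the player sees: here, just the current position."""
        return self.position

    def step(self, action: int) -> float:
        """Apply a move (-1 = left, +1 = right) and return the feedback."""
        self.position += action
        if self.position == 3:      # reached the coin
            self.done = True
            return 1.0
        if self.position == -2:     # fell into the pit
            self.done = True
            return -1.0
        return 0.0


def play_episode(policy, max_steps: int = 50) -> float:
    """Run one game: observe, decide, act, until the game ends."""
    game = ToyGame()
    total = 0.0
    for _ in range(max_steps):
        observation = game.observe()   # step 1: observe the interface
        action = policy(observation)   # step 2: decide on a move
        total += game.step(action)     # step 3: act and see the result
        if game.done:
            break
    return total


def random_policy(observation: int) -> int:
    """A player who has not learned anything yet: moves at random."""
    return random.choice([-1, 1])
```

A policy that always moves right collects the coin, `play_episode(lambda obs: 1)` returns `1.0`, while the random policy sometimes falls into the pit. Learning consists of replacing `random_policy` with one that improves from this feedback.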

[...] The player uses reinforcement learning, just like the human player in the Super Mario game seen earlier. The algorithm evolves its own strategy through interaction with the environment. In our case, the process uses three classical reinforcement learning algorithms: A2C, DDPG and PPO. Other details: considering the various complex factors that can occur in practice, our example makes some assumptions, such as the assumption that the algorithm will not fundamentally affect the stock market and the assumption that money already earned will not be lost. There is also a rule for special circumstances: the indicators are used to judge whether the market is about to collapse, and if so, the trader stops buying and selling all stocks. [...]
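The rule for special circumstances can be pictured as a gate placed between the algorithm and the market. In the sketch below, `risk_indicator` and `threshold` are hypothetical names: the extract does not say which indicator is used or how the collapse level is chosen, so both are assumptions made for illustration.

```python
from typing import List


def gate_actions(proposed: List[int], risk_indicator: float,
                 threshold: float) -> List[int]:
    """Pass the agent's orders through unless a collapse is suspected.

    `proposed` holds one signed share count per stock (positive = buy,
    negative = sell). When the hypothetical risk indicator reaches the
    threshold, every order is replaced by 0, i.e. no buying or selling.
    """
    if risk_indicator >= threshold:
        return [0] * len(proposed)
    return list(proposed)
```

For example, `gate_actions([10, -5, 0], risk_indicator=0.9, threshold=0.8)` returns `[0, 0, 0]`, while the same orders with a risk indicator of `0.2` pass through unchanged.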

[...] Since it is assumed that you sell on the same day you buy and take the profit, the selling decision is eliminated, which makes learning easier. What remains is the reward. The reward is the day's profit: the closing price minus the opening price on the day of purchase. Since deep reinforcement learning involves random processing, it may not always be possible to reproduce exactly the same result. The calculation was therefore repeated 100 times to verify accuracy. [...]
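A minimal sketch of this reward and of the repeated evaluation follows. The random buy decision in `run_once` is only a placeholder for a trained agent, and the function names and the use of one seed per run are assumptions made for this example.

```python
import random
import statistics
from typing import List, Tuple


def day_reward(open_price: float, close_price: float, buy: bool) -> float:
    """Reward for one day: close minus open if we bought, else nothing."""
    return close_price - open_price if buy else 0.0


def run_once(days: List[Tuple[float, float]], seed: int) -> float:
    """One evaluation run over (open, close) pairs with a fixed seed.

    The random buy decision is a placeholder for the trained agent.
    """
    rng = random.Random(seed)
    return sum(day_reward(o, c, rng.random() < 0.5) for o, c in days)


def evaluate(days: List[Tuple[float, float]],
             runs: int = 100) -> Tuple[float, float]:
    """Repeat the run with different seeds; return mean and spread."""
    totals = [run_once(days, seed) for seed in range(runs)]
    return statistics.mean(totals), statistics.stdev(totals)
```

Reporting the mean together with the standard deviation over the 100 runs shows how much of a result comes from the strategy and how much from the randomness of a single run.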
