Saturday 29 February 2020

Summary, 29th February 2020

Where are we?

We have been working on a project to forecast the stock market. The purpose of this project has been clear: to make a profit by applying deep learning to the financial markets.

How can you achieve this?

There are three crucial factors to accomplish this goal. 

  1. The quantity and quality of the training data
  2. The sophistication and fitness of Layers in the Sequential model
  3. How we operate AIs

The quantity and quality of the training data

To reach a better outcome, we need to avoid overfitting. There are two ways to do so. The first is to increase the amount of training data. The other is to add a dropout layer to the Sequential model. It is sometimes challenging to increase the volume of training data, however.

The easiest way to increase the amount of training data is to add various other price data to it. However, mingling Japanese stock market information with American stock market data in the training set might confuse the AI during the learning process. I also confirmed that it does not drastically improve the result.

On the other hand, specializing the dataset by narrowing down the types of information in it leaves too little data to prevent overfitting. Is there a way to boost the volume of the training dataset while avoiding excessive diversification in it?

Well, I found one. 

As shown in the figure above, we formerly just divided the data and did not allow any overlap between one segment and the next. It turned out, however, that this was not the best method: the overlapping approach lets us extract more training samples from the same original data. Note that the "Data" shown abstractly in the image above is, for example, like the image data below.
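The overlapping-window idea can be sketched as follows. This is a minimal NumPy illustration; the window length and stride values are assumptions for demonstration, not the settings actually used in this project.

```python
import numpy as np

def make_windows(series, window=5, stride=1):
    """Extract windows of length `window`, advancing `stride` steps at a time.
    With stride < window, consecutive windows overlap, so the same series
    yields far more training samples than non-overlapping splitting."""
    return np.array([series[i:i + window]
                     for i in range(0, len(series) - window + 1, stride)])

prices = np.arange(10)  # stand-in for a price series

non_overlap = make_windows(prices, window=5, stride=5)  # former method: 2 samples
overlap = make_windows(prices, window=5, stride=1)      # new method: 6 samples
```

With a stride smaller than the window length, consecutive samples share most of their points, so the identical original data produces several times more training examples.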

There is another excellent feature of this mechanism. Eliminating static data, i.e. periods with little price fluctuation, can improve the quality of the training dataset, but it also reduces its volume. The overlapping procedure, however, makes up for this shortcoming by multiplying the quantity of training samples.
Henceforward, the goal is to exploit the full potential of the data: increase the amount of training data using the system above, while standardizing the data and appropriately removing specific data that is not considered necessary for the learning process.
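That pipeline might be sketched like this, assuming the windows are already extracted. The range threshold and the per-window standardization are illustrative choices, not the project's actual preprocessing.

```python
import numpy as np

def prepare(windows, min_range=0.5):
    """Drop near-static windows (too little price movement to learn from),
    then standardize each remaining window to zero mean and unit variance.
    The 0.5 threshold is an arbitrary illustration, not a tuned value."""
    spans = windows.max(axis=1) - windows.min(axis=1)
    kept = windows[spans >= min_range]
    mean = kept.mean(axis=1, keepdims=True)
    std = kept.std(axis=1, keepdims=True)
    return (kept - mean) / std

windows = np.array([[1.0, 1.0, 1.0, 1.0],   # static window -> removed
                    [1.0, 2.0, 3.0, 4.0]])  # kept and standardized
out = prepare(windows)
```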

The sophistication and fitness of Layers in the Sequential model

Although it takes effort to comprehend, this factor plays a significant role in composing an exceptional AI model. Therefore, it is worth learning about the layers available in Keras.

At first, I was using a Sequential model copied from somewhere without much thought. There are, however, certain limitations to this approach. Accordingly, I decided to learn more about the theory behind it. One essential piece of what I learned is the order of layers in a Sequential model:

- Convolution

- ReLU

- Pooling

- Flatten

- Fully Connected (Dense)

- Softmax

Note that the Convolution, ReLU and Pooling layers can be repeated as a block. I have no idea why, but such repetition, as in the Sequential model below, often generates a better result.
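As a sketch, such a model might look like the following in Keras, with the Convolution–ReLU–Pooling block repeated twice. The filter counts, kernel sizes, input shape and class count are placeholder assumptions, not the exact values of the model in this project.

```python
from tensorflow import keras
from tensorflow.keras import layers

model = keras.Sequential([
    keras.Input(shape=(64, 1)),              # e.g. a 64-step price window
    # Convolution -> ReLU -> Pooling (first block)
    layers.Conv1D(32, kernel_size=3, activation="relu"),
    layers.MaxPooling1D(pool_size=2),
    # Convolution -> ReLU -> Pooling (repeated block)
    layers.Conv1D(64, kernel_size=3, activation="relu"),
    layers.MaxPooling1D(pool_size=2),
    # Flatten -> Fully Connected (Dense) -> Softmax
    layers.Flatten(),
    layers.Dense(64, activation="relu"),
    layers.Dropout(0.5),                     # the dropout layer mentioned earlier
    layers.Dense(3, activation="softmax"),   # e.g. down / flat / up
])
model.compile(optimizer="adam", loss="categorical_crossentropy")
```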

Please remember that I am still unsure about the theory behind it and what is happening inside this model. Therefore, the Sequential model above might look like a complete mess to professionals.

How we operate AIs

Since I have not yet made much effort on this matter, this part remains a problem to be solved.

Although I implemented an AI democratic decision-making system to make the decisions made by the AIs more reliable and accurate, the remaining problems are when to use it and how to reflect its results in buying and selling actions.
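The voting step itself can be sketched in a few lines of plain Python. The AI names come from this diary, but the votes and the simple majority rule shown here are illustrative assumptions.

```python
from collections import Counter

def democratic_decision(predictions):
    """Each AI votes 'up', 'down', or 'flat'; the majority wins.
    Returns the winning label and its vote share."""
    votes = Counter(predictions)
    label, count = votes.most_common(1)[0]
    return label, count / len(predictions)

# Illustrative votes from the three AIs mentioned in this diary.
votes = {"Liselotte": "up", "Alma": "up", "Vanessa": "down"}
decision, share = democratic_decision(list(votes.values()))
```

How strong a majority should trigger an actual buy or sell remains exactly the open question above.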

Conveniently or not, the Japanese stock market crashed a few days ago. Let us show those three AIs the latest price chart and see what happens. 

Two-thirds of the AIs said it would go up. I wonder what will actually happen next week.

Tuesday 4 February 2020

Still progressing.

I have been working on the so-called “Democratic approach project”. Under the project, I attempt to make a democratic AI system in which AIs make decisions based on the outcome of their votes.
I started by making multiple models, or AIs.
The first AI I made was accidentally favourable. I call it Liselotte.
A problem, however, arose when I tried to create the second AI called Alma.
The issues that occurred during the process are below.
1.    Overfitting
2.    Outputs of 0 are too frequent (70%–100% of all the predictions it generates)

When I try to fix #1, the overfitting problem, by adding a normalization layer such as dropout, problem #2 appears, and vice versa.

There are mainly three variables in the equation for solving this dilemma: 1. the amount and intensity of normalization layers, 2. the number of epochs, and 3. the chosen optimizer.
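One hypothetical way to explore those three variables is to enumerate their combinations and train a candidate model for each. The value grids below are illustrative assumptions, not the values actually tried in this project.

```python
import itertools

# Illustrative grids for the three variables named above (assumed values).
dropout_rates = [0.2, 0.5]               # amount/intensity of normalization
epoch_counts = [10, 50]                  # number of epochs
optimizers = ["adam", "rmsprop", "sgd"]  # chosen optimizer

# Every combination becomes one training run to evaluate.
configs = list(itertools.product(dropout_rates, epoch_counts, optimizers))
```

Each tuple in `configs` would then be used to build, train and evaluate one candidate model, keeping whichever combination best balances overfitting against the all-zero-output problem.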

Other factors influence the results besides the training phase: 1. data arrangement, and 2. model or layer architecture (slightly overlapping with the three variables above).
Occasionally, I find a better combination of them and get an acceptable model; in other words, I get another AI joining the AI democratic congress.

I have so far made three AIs, called Liselotte, Alma and Vanessa. Alma and Vanessa seem far superior to Liselotte. Therefore, I plan to exile Liselotte from the assembly, but it is still under consideration.
