Assuming you followed the instructions from my last article on how to automatically extract fundamental data, you would now have a tool to automatically extract the historical fundamental data on companies you are interested in. In a short span of time, you will be in possession of a ton of data. If you ask data scientists, they will tell you that on any data project, obtaining data is 90% of the work, so let me congratulate you on getting 90% there! Unfortunately, it is unlikely that you are a data scientist, so how do you actually complete the final 10% to get value from the data you have collected?
Well, the possibilities are limitless, since the data is on your local computer and not locked away behind some websites' paywall. Let me provide a few examples of what you can do.
Well, the possibilities are limitless, since the data is on your local computer and not locked away behind some websites' paywall. Let me provide a few examples of what you can do.
Create your own stock screener
Want to identify what stocks meet your criteria for investment? Investors would typically turn to online stock screeners. There are many good and free stock screeners out there, like finviz.com. However, the extent of the scanning is limited to the parameters available on the website. For example, finviz.com lets you scan based on the latest data of the companies. But as we hypothesized previously, since all fundamental data is historical, we should not just be looking at the latest data. We should be attempting to look at the trend of the data of each company, so that we can have confidence that the company will continue to do similarly into the future.
The first thing you need to do is get a large set of companies' ticker symbols. For the US market, most would typically start with the S&P 500 index and NASDAQ 100 components. This will give you roughly 500+ of the largest companies in the US. You can find the list of S&P 500 index components from this Wikipedia article and the list of NASDAQ 100 components from NASDAQ's website.
The first thing you need to do is get a large set of companies' ticker symbols. For the US market, most would typically start with the S&P 500 index and NASDAQ 100 components. This will give you roughly 500+ of the largest companies in the US. You can find the list of S&P 500 index components from this Wikipedia article and the list of NASDAQ 100 components from NASDAQ's website.
Once you have your data, a simple way of analyzing the trend of the companies is to analyze the linear trend of each of the parameters you would like to look at. Since the data extracted is in Excel, you can create a new column beside the data you are interested in and use the SLOPE function in Excel to calculate the linear trend. In the example below, it shows that AAPL has been increasing it's revenue consistently for the last 10 years, with an average increase of US$22,354 mil per year.
You can download this file to get at the specific formula to calculate the slope.
trend_example.xlsx  
File Size:  31 kb 
File Type:  xlsx 
Now all you need to do is copy this same formula across all the parameters you are interested in and use Excel's filter function to extract the companies that meet your criteria.
That is just one way of screening the companies, what other ways can you think of? The possibilities are only limited by your imagination....
That is just one way of screening the companies, what other ways can you think of? The possibilities are only limited by your imagination....
Determine what fundamental attributes affect price
This is the holy grail of fundamental analysis. The idea is, if you can figure out which exact fundamental attributes lead to a positive impact on price, you can use the screener you built above to identify companies that have those attributes and invest in them.
The first step is the most tedious and difficult step, to collect the required data. One key tenet of predictive analytics is to have data over roughly 3 times the amount of time as you would like to predict. This is so that you can split the data into 3 sets, the training set, the validation set and the test set. So if you would like to predict data over 3 months, you should collect at least 9 months worth of data. You should have the same stocks in all 3 data sets.
The first step is the most tedious and difficult step, to collect the required data. One key tenet of predictive analytics is to have data over roughly 3 times the amount of time as you would like to predict. This is so that you can split the data into 3 sets, the training set, the validation set and the test set. So if you would like to predict data over 3 months, you should collect at least 9 months worth of data. You should have the same stocks in all 3 data sets.
The training set is the data you use to "train" a model. If you know anything about machine learning, this is the data they feed into a computer program to adjust the parameters of the computer program to fit around the data. In layman terms, we will trying to find the best parameters in this set of data that leads to the best results in the validation set.
Once we have trained the model, the test set is used to test whether your model actually works. Basically, you apply the criteria that you found using the training set to the validation set and use these to pick the stocks in the test set. If the stocks you picked in the test set performed better than the other stocks that you did not pick, congratulations, you have found a set of criteria that will likely give you better than average returns over 3 months.
Enough theory, for our practical purposes, let's have a look at the specific steps you need to take.
Once we have trained the model, the test set is used to test whether your model actually works. Basically, you apply the criteria that you found using the training set to the validation set and use these to pick the stocks in the test set. If the stocks you picked in the test set performed better than the other stocks that you did not pick, congratulations, you have found a set of criteria that will likely give you better than average returns over 3 months.
Enough theory, for our practical purposes, let's have a look at the specific steps you need to take.
Step 
Description 
Example 
1 
Referencing the visualization of data sets above, for the same set of companies, perform 2 extracts of the historical data.

With a list of 100 companies, you extract 2 sets of data on 01 Jan 1 and 01 Apr. You also collect the stock prices of those 100 companies on 01 Jan, 01 Apr and 30 Jun. 
2 
Calculate the change in the stock prices between month 4 and month 7. Create a new column in the Excel sheet where you do this calculation and for the ones with positive change, mark these companies as 1. For the companies with negative change, mark these companies as 0. 
Calculate the difference in stock prices from 01 Jan to 01 Apr. Of the 100 companies, 70 companies have a positive difference, mark them with a 1. Mark the rest with a 0. 
3 
Now your task is to find a set of criteria in the month 4 fundamental data that lets you filter out as many companies that are marked as 1 as possible. You can try each attribute in turn, or use a calculation of attributes, like the trend calculation I showed you above. 
Try to find a set of criteria from the 01 Jan dataset that results in the most positive companies remaining and most negative companies being filtered out. Say you found 2 criteria.
So you would hope that applying this criteria in the future would result in you investing in 50/70 (71%) of positive companies and 5/30 (17%) of negative companies. 
4 
Using the same criteria you identified, apply these to the data from month 7 and filter out the stocks that fit these criteria. 
Apply the criteria you have found to the 01 Apr and determine the set of companies that you would invest in, based on the criteria you have identified. 
5 
For the stocks that fit your criteria, calculate the stock price difference between the end of month 9 and start of month 7. 
For the stocks that fit your criteria, calculate the stock price difference between 01 Apr and 30 Jun. 
6 
Calculate the average percentage price change for your whole data set. The average price change for your selected stocks should perform better than the average price change of the whole data set, otherwise, revise your criteria. 
Calculate the average percentage price change for your whole data set. Say on average for all the stocks in your list, the prices increased by 5% in that 3 month period. With your filtered list, the prices increased by 7%. This gives you confidence that the fundamental criteria you selected might have some impact on the stock prices within a 3 month period. 
Once you have identified a set of criteria, extract the latest data from morningstar and filter out the companies that meet your criteria and invest in them. Hopefully they will perform better than the market average, good luck!