Subscribe to RSS feed
posted on 23 Feb 2016  -  33,380 views
Like many others, I enjoy buying stocks that pay good dividends.
The challenge is in identifying companies that will be able to pay good dividends year after year.
Many might suggest reading through years of financial statements, as well as getting to know the top management and their vision for the company as it is the ideal way of projecting its future cashflow and thus, dividends.
I do agree that that might be a good way, but the
statistician part of me prefers to view it as a statistical game (because looking forward, I believe everything is probabilistic no matter how well you research, hence why not let statistics and computers do their jobs as they are more scalable and suitable in a probabilistic world, but this is a story for another time).
Therefore, in my previous two articles (linked
here), I was looking at how two fundamental features, namely Debt-to-Equity Ratio and Operating Cashflow, could be used in the forecasting of next year's dividends. They were interesting, but to be truly useful, I felt that they needed to be made into a predictive model. More precisely, we would like to have a model that returns P("same or more dividends next year" | "predicted to be giving same or more dividends next year"), which I have coined
Hence, in this article, I would like to introduce a model that attempts to produce an estimation of Dividend Strength. Dividend Strength, by my definition, is the likelihood a stock would give same or more dividends the following year according to the model. (e.g. if Dividend Strength is stated to be 90%, it means that our statistical model expects the company to be able to give the same or higher dividends the following year with a 90% chance.)
The model does this by learning from history how stocks with certain traits found in its financial statements would influence its dividends payout the following year statistically. See below if you are interested in the technical details of how the model is built.
Dividend Strength is now available in several places (e.g.
iSuggest) in SGXcafe and I intend to integrate it even further. I am thinking about how I could use it to possibly place a price tag on a company such as using
Dividend Discount Model or something along that line. Also, if Dividend Strength proves to be useful, I might also consider developing similar models such as Earnings Strength, how likely the company earnings will continue to grow, etc.
Hope you will like Dividend Strength and feel free to give me your feedback on it!
Disclaimer: Use Dividend Strength at your own risk :)
This is more for my future reference. It is totally fine to skip this section.
Optimizing Metric: I chose the Area Under ROC (aka AUC) as the single metric to choose between various possible feature set and model.
Feature Generation: I tried different combinations of feature sets up to 60k+ features at one point. Basically, beyond a certain number of features, having more features both increased the running time and slightly decreased AUC. So the final chosen feature set is a combination of 884 fundamental features that is either the value of a fundamental data, change value of a fundamental data, magnitude change of a fundamental data, or combination of two values of a fundamental data via subtraction or division.
Feature Selection: I tried four different selection algorithms namely:
FCBF and Top K with base measures using Mutual Information, Symmetrical Uncertainty, T-Test, and Wilcoxon signed-rank test. Finally, RBF is performing consistently at the top with somewhat similar number of selected features under all conditions regardless of the size of features while FCBF tends to be too strict and often selects too little features in the end, whereas T-Test and Wilcoxon signed-rank test often select too many features and significantly increases the runtime (not to mention they have more parameters to tune).
Classifiers: Again, I tried a few different possible ones: C4.5, Support Vector Machine, Naive Bayes and Random Forest (100), and Bagging with SVM (100). It is the combination of RBF with tree based classifiers that outperforms the rest, which I believe is due to the nature of RBF where redundant features are removed. Hence, Random Forest (100) was chosen.
Model Score to Dividend Strength: Finally, it is not meaningful to simply give the model score on any particular stock, hence I converted it into something I call Dividend Strength which really is P("true positive" | "predicted positive") based on ten fold cross-validation. This is why the "worst" possible Dividend Strength will be "total positive count" / "total count" which is approximately 63% in my dataset. Note: positive here means giving the same or higher dividends the following year.
i) Ten classifiers with different sets of features which were built during the ten fold cross-validation are actually used to generate a mean model score for a new sample.
ii) While it might be more ideal if I had kept a separate dataset to check the final model performance, given that I only had 1,641 data points, I had decided to let the true test be based on 2016's dividends results.
iii) This model is updated weekly and hence the Dividend Strength of stocks would change only weekly.
iv) I will build something later that will track how well this model is performing.
v) The results of the ten fold cross-validation on the selected approach is only an AUC of 0.65, which is low compared to other classification problems. I am not sure how much further I can go because I do not know what is the state-of-art as I did not really survey the literature. However, from another perspective, it is 30% (65 / 50) better than random in choosing better performing dividend stocks.
Next Article >
< Previous Article
Do You Keep Your Extra Cash in ...
Growing Dividends - Does Debt-to-equity ...
List All Articles
Other articles by evankoh
New Exchange Added: Support for Tokyo Stock Exchange!
I am happy to share that StocksCafe can now fully support Tokyo Stock Exchange! You should be able to use all the features that are available for Tokyo as well. This brings the total number of supported exchanges to six (SGX, HKEX, KLSE, USX, LSE and TSE!) One thing to note though is that I was unable to locate good English sources for news and blogs that covers the Tokyo market. So if you know of ...
Personalized grouping of Stocks
A feature that has been requested by various users is the ability to tag stocks. With that, one would be able to group stocks by their own definitions and a wide variety of possibilities would open up. One specific application would be to look at what percentage of your portfolio is in various groupings (for example, you can now see what percentage of your portfolio are made up of dividend stocks, ...
Recent Updates: Email Timing and Others
While StocksCafe is constantly fixing bugs and adding new features, once in a while, I would like to highlight and explain some new features or changes. The following are recent updates which I believe deserve to be highlighted: 1. Market Update Emails Market update emails is one of the more popular features in StocksCafe. In the past, it was catered solely for the Singapore market thus over the weekend, ...