posted on 23 Feb 2016  -  25,588 views
Like many others, I enjoy buying stocks that pay good dividends.
The challenge is in identifying companies that will be able to pay good dividends year after year.
Many might suggest reading through years of financial statements and getting to know the top management and their vision for the company, as this is the ideal way of projecting its future cashflow and thus its dividends.
I do agree that this might be a good way, but the statistician part of me prefers to view it as a statistical game. Looking forward, I believe everything is probabilistic no matter how well you research, so why not let statistics and computers do their jobs, as they are more scalable and suitable in a probabilistic world? (But this is a story for another time.)
Therefore, in my previous two articles (linked here), I looked at how two fundamental features, namely Debt-to-Equity Ratio and Operating Cashflow, could be used to forecast next year's dividends. They were interesting, but to be truly useful, I felt that they needed to be made into a predictive model. More precisely, we would like to have a model that returns P("same or more dividends next year" | "predicted to be giving same or more dividends next year"), which I have coined Dividend Strength.
Hence, in this article, I would like to introduce a model that attempts to estimate Dividend Strength. Dividend Strength, by my definition, is the likelihood that a stock will pay the same or higher dividends the following year, according to the model. (E.g. if Dividend Strength is stated to be 90%, it means that our statistical model expects the company to pay the same or higher dividends the following year with a 90% chance.)
The model does this by learning from history how certain traits found in a stock's financial statements statistically influence its dividend payout the following year. See below if you are interested in the technical details of how the model is built.
Dividend Strength is now available in several places (e.g. iSuggest) in SGXcafe and I intend to integrate it even further. I am thinking about how I could use it to place a price tag on a company, for example by using the Dividend Discount Model or something along those lines. Also, if Dividend Strength proves to be useful, I might consider developing similar models such as Earnings Strength (how likely the company's earnings are to continue growing), etc.
Hope you will like Dividend Strength and feel free to give me your feedback on it!
Disclaimer: Use Dividend Strength at your own risk :)
This is more for my future reference. It is totally fine to skip this section.
Optimizing Metric: I chose the Area Under the ROC Curve (AUC) as the single metric for choosing between the candidate feature sets and models.
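For readers unfamiliar with AUC, here is a minimal sketch of how it can be computed, using the interpretation that AUC is the probability that a randomly chosen positive sample scores higher than a randomly chosen negative one. The function name `roc_auc` and the sample data are illustrative assumptions, not the code used for the model.

```python
def roc_auc(labels, scores):
    """AUC as the chance that a random positive outranks a random negative.

    labels: 1 for "same or higher dividends next year", 0 otherwise.
    scores: model scores, higher meaning more confident of a positive.
    Ties between a positive and a negative count as half a win.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative sample")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Made-up example: three of the four positive/negative pairs are ordered correctly.
example_auc = roc_auc([1, 0, 1, 0], [0.9, 0.1, 0.4, 0.6])
```

A score of 0.5 corresponds to random ranking and 1.0 to a perfect ranking, which is why AUC is convenient as a single number for comparing approaches.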
Feature Generation: I tried different combinations of feature sets, up to 60k+ features at one point. Basically, beyond a certain number of features, having more features both increased the running time and slightly decreased AUC. So the final chosen feature set is a combination of 884 fundamental features, each of which is either the value of a fundamental data item, the change in its value, the magnitude of its change, or a combination of two fundamental values via subtraction or division.
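As a rough illustration of the four kinds of features described above, here is a small sketch. The function name, the field names, and the reading of "magnitude change" as a relative (percentage) change are all assumptions made for this example; the real feature definitions may differ.

```python
from itertools import combinations


def build_features(current, previous):
    """Derive features from two consecutive years of fundamental data.

    current / previous: dicts mapping a fundamental item name to its value.
    """
    features = {}
    for name, value in current.items():
        # 1) the raw value of the fundamental item
        features[name] = value
        prev = previous.get(name)
        if prev is not None:
            # 2) the change in value from the previous year
            features[f"{name}_change"] = value - prev
            # 3) the magnitude of change (assumed here to be relative change)
            if prev != 0:
                features[f"{name}_pct_change"] = (value - prev) / abs(prev)

    # 4) combinations of two items via subtraction or division
    for a, b in combinations(sorted(current), 2):
        features[f"{a}_minus_{b}"] = current[a] - current[b]
        if current[b] != 0:
            features[f"{a}_over_{b}"] = current[a] / current[b]
    return features


# Made-up numbers: debt grew from 40 to 50 while equity stayed at 100.
example = build_features(
    {"debt": 50.0, "equity": 100.0},
    {"debt": 40.0, "equity": 100.0},
)
```

With only two items this already yields eight features, which shows how quickly the count grows with pairwise combinations.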
Feature Selection: I tried four different selection algorithms, namely RBF, FCBF, and Top K with base measures using Mutual Information, Symmetrical Uncertainty, T-Test, and Wilcoxon signed-rank test. In the end, RBF performed consistently at the top, selecting a similar number of features under all conditions regardless of the size of the feature set. FCBF tends to be too strict and often selects too few features, whereas T-Test and Wilcoxon signed-rank test often select too many features and significantly increase the runtime (not to mention they have more parameters to tune).
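To make the "Top K with a base measure" idea concrete, here is a minimal sketch using Mutual Information on features that have already been discretized. It is not the RBF approach that was finally chosen, and the function names and toy data are assumptions for illustration only.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Mutual information (in bits) between two discrete sequences."""
    n = len(xs)
    count_x = Counter(xs)
    count_y = Counter(ys)
    count_xy = Counter(zip(xs, ys))
    mi = 0.0
    for (x, y), c in count_xy.items():
        # p(x,y) * log2( p(x,y) / (p(x) * p(y)) )
        mi += (c / n) * math.log2(c * n / (count_x[x] * count_y[y]))
    return mi


def top_k_features(columns, labels, k):
    """Rank feature columns by mutual information with the labels; keep the best k."""
    ranked = sorted(
        columns,
        key=lambda name: mutual_information(columns[name], labels),
        reverse=True,
    )
    return ranked[:k]


# Toy data: "a" matches the label exactly, "b" is unrelated, "c" is partly related.
labels = [1, 1, 0, 0]
columns = {
    "a": [1, 1, 0, 0],
    "b": [1, 0, 1, 0],
    "c": [1, 1, 1, 0],
}
selected = top_k_features(columns, labels, k=2)
```

Note that a plain Top K ranking like this looks at each feature on its own, so it can keep several near-duplicate features; redundancy-aware methods try to avoid that.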
Classifiers: Again, I tried a few different ones: C4.5, Support Vector Machine, Naive Bayes, Random Forest (100), and Bagging with SVM (100). It is the combination of RBF with tree-based classifiers that outperforms the rest, which I believe is due to the nature of RBF, where redundant features are removed. Hence, Random Forest (100) was chosen.
Model Score to Dividend Strength: Finally, it is not meaningful to simply give the raw model score for any particular stock, hence I converted it into something I call Dividend Strength, which really is P("true positive" | "predicted positive") based on ten-fold cross-validation. This is why the "worst" possible Dividend Strength will be "total positive count" / "total count", which is approximately 63% in my dataset. Note: positive here means giving the same or higher dividends the following year.
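One way to picture this conversion is sketched below. It assumes that a stock's model score is used as the cut-off for "predicted positive", and that Dividend Strength is then the share of actual positives among the cross-validated samples scoring at or above that cut-off. The function name, the clamping rule for very high scores, and the data are assumptions for illustration, not the exact procedure used.

```python
def dividend_strength(score, cv_scores, cv_labels):
    """Estimate P(actual positive | predicted positive) for a given model score.

    cv_scores / cv_labels: out-of-fold scores and true labels (1 = same or
    higher dividends the following year) collected during cross-validation.
    """
    # Assumption: if the score is higher than anything seen in cross-validation,
    # fall back to the highest observed score so the estimate is still defined.
    threshold = min(score, max(cv_scores))
    selected = [y for s, y in zip(cv_scores, cv_labels) if s >= threshold]
    return sum(selected) / len(selected)


# Made-up cross-validation results with a base rate of 3 positives out of 5.
cv_scores = [0.9, 0.8, 0.6, 0.4, 0.2]
cv_labels = [1, 1, 0, 1, 0]

lowest = dividend_strength(0.0, cv_scores, cv_labels)   # everything is predicted positive
highest = dividend_strength(0.8, cv_scores, cv_labels)  # only the top two samples qualify
```

Under this reading, the lowest possible score treats every sample as predicted positive, so the estimate collapses to the share of positives in the whole dataset, which matches the "worst" case described above.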
i) Ten classifiers with different sets of features which were built during the ten fold cross-validation are actually used to generate a mean model score for a new sample.
ii) While it would have been better to keep a separate dataset to check the final model performance, I only had 1,641 data points, so I decided to let the true test be 2016's dividend results.
iii) This model is updated weekly and hence the Dividend Strength of stocks would change only weekly.
iv) I will build something later that will track how well this model is performing.
v) The result of the ten-fold cross-validation on the selected approach is an AUC of only 0.65, which is low compared to other classification problems. I am not sure how much further I can go because I did not really survey the literature and so do not know what the state of the art is. However, from another perspective, it is 30% (0.65 / 0.50 = 1.3) better than random at choosing better performing dividend stocks.
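Regarding note i), averaging the ten cross-validation classifiers can be sketched as follows. Each fold model is represented here as a hypothetical pair of (selected feature names, scoring function), since each fold selected its own features; the names and the toy scoring functions are assumptions for illustration.

```python
from statistics import mean


def mean_model_score(fold_models, sample):
    """Average the scores of the per-fold classifiers for one new sample.

    fold_models: list of (feature_names, score_fn) pairs, one per fold.
    sample: dict mapping feature name to value for the new stock.
    """
    scores = []
    for feature_names, score_fn in fold_models:
        # Each classifier only sees the features selected in its own fold.
        row = [sample[name] for name in feature_names]
        scores.append(score_fn(row))
    return mean(scores)


# Two toy "classifiers" standing in for the ten real ones.
fold_models = [
    (["a"], lambda row: row[0]),
    (["a", "b"], lambda row: (row[0] + row[1]) / 2),
]
score = mean_model_score(fold_models, {"a": 0.8, "b": 0.4})
```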