posted on 23 Feb 2016  -  7,681 views
Like many others, I enjoy buying stocks that pay good dividends.
The challenge is in identifying companies that will be able to pay good dividends year after year.
Many might suggest reading through years of financial statements and getting to know the top management and their vision for the company, as that is the ideal way of projecting its future cash flow and thus its dividends.
I do agree that this might be a good way, but the statistician part of me prefers to view it as a statistical game. Looking forward, I believe everything is probabilistic no matter how well you research, so why not let statistics and computers do their jobs, as they are more scalable and better suited to a probabilistic world? But this is a story for another time.
Therefore, in my previous two articles (linked here), I looked at how two fundamental features, namely Debt-to-Equity Ratio and Operating Cashflow, could be used to forecast next year's dividends. They were interesting, but to be truly useful, I felt that they needed to be made into a predictive model. More precisely, we would like to have a model that returns P("same or more dividends next year" | "predicted to be giving same or more dividends next year"), which I have coined Dividend Strength.
Hence, in this article, I would like to introduce a model that attempts to estimate Dividend Strength. Dividend Strength, by my definition, is the likelihood, according to the model, that a stock will give the same or more dividends the following year. (For example, if Dividend Strength is stated to be 90%, it means that our statistical model gives the company a 90% chance of paying the same or higher dividends the following year.)
The model does this by learning from history how certain traits found in a stock's financial statements statistically influence its dividend payout the following year. See below if you are interested in the technical details of how the model is built.
Dividend Strength is now available in several places (e.g. iSuggest) in SGXcafe and I intend to integrate it even further. I am thinking about how I could use it to place a price tag on a company, for example by using the Dividend Discount Model or something along those lines. Also, if Dividend Strength proves to be useful, I might consider developing similar models such as Earnings Strength (how likely the company's earnings are to continue to grow), etc.
Hope you will like Dividend Strength and feel free to give me your feedback on it!
Disclaimer: Use Dividend Strength at your own risk :)
This is more for my future reference. It is totally fine to skip this section.
Optimizing Metric: I chose the Area Under the ROC Curve (aka AUC) as the single metric for choosing between the various possible feature sets and models.
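For readers unfamiliar with AUC, here is a minimal sketch of how it can be computed, assuming binary labels (1 = same or higher dividends) and one model score per sample. The function name `auc` and the sample data are made up for illustration and are not taken from the actual SGXcafe code.

```python
def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a randomly chosen positive sample
    scores higher than a randomly chosen negative one (ties count 0.5).
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Made-up data: 1 = same or higher dividends the following year.
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
print(auc(labels, scores))
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect ranking. The pairwise loop is quadratic, which is fine for a dataset of a few thousand points.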
Feature Generation: I tried different combinations of feature sets, with up to 60k+ features at one point. Basically, beyond a certain number of features, adding more both increased the running time and slightly decreased AUC. So the final chosen feature set is a combination of 884 fundamental features, each of which is either the value of a fundamental data item, its change in value, its change in magnitude, or a combination of two fundamental data items via subtraction or division.
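The four kinds of features described above could be generated roughly as follows. This is only a sketch: the field names are hypothetical, "magnitude change" is assumed here to mean relative change, and pairs are combined in one direction only, none of which is confirmed by the original setup.

```python
from itertools import combinations


def generate_features(current, previous):
    """Build features from this year's and last year's fundamentals.

    Both arguments are dicts mapping a (hypothetical) field name to a number.
    """
    features = {}
    for name, value in current.items():
        prev = previous[name]
        features[name] = value                      # raw value
        features[f"{name}_change"] = value - prev   # change in value
        # Assumption: "magnitude change" is taken to be the relative change.
        features[f"{name}_pct_change"] = (
            (value - prev) / abs(prev) if prev != 0 else None
        )
    # Combine two fundamentals via subtraction or division.
    for a, b in combinations(sorted(current), 2):
        features[f"{a}_minus_{b}"] = current[a] - current[b]
        features[f"{a}_over_{b}"] = (
            current[a] / current[b] if current[b] != 0 else None
        )
    return features


# Made-up numbers for illustration only.
current = {"total_debt": 50.0, "total_equity": 100.0, "operating_cashflow": 20.0}
previous = {"total_debt": 40.0, "total_equity": 80.0, "operating_cashflow": 25.0}
feats = generate_features(current, previous)
print(len(feats), feats["total_debt_over_total_equity"])
```

With n fundamentals this yields 3n single-field features plus n(n-1) pairwise ones, which shows how quickly the feature count can grow.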
Feature Selection: I tried four different selection algorithms, namely: FCBF and Top K with base measures using Mutual Information, Symmetrical Uncertainty, T-Test, and Wilcoxon signed-rank test. In the end, RBF performed consistently at the top, selecting a broadly similar number of features under all conditions regardless of the size of the feature set. FCBF tends to be too strict and often selects too few features, whereas T-Test and Wilcoxon signed-rank test often select too many features and significantly increase the runtime (not to mention they have more parameters to tune).
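As an illustration of the simplest option mentioned above, Top K with Mutual Information as the base measure, here is a minimal sketch. It assumes the features have already been discretized (continuous fundamentals would need binning first), and the feature names and data are made up. It is not the RBF or FCBF procedure, which additionally remove redundant features.

```python
from collections import Counter
from math import log2


def mutual_information(xs, ys):
    """Mutual information (in bits) between two discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    mi = 0.0
    for (x, y), c in pxy.items():
        # p(x,y) * log2( p(x,y) / (p(x) * p(y)) )
        mi += (c / n) * log2(c * n / (px[x] * py[y]))
    return mi


def top_k_features(columns, labels, k):
    """Return the names of the k features most informative about the labels."""
    ranked = sorted(
        columns,
        key=lambda name: mutual_information(columns[name], labels),
        reverse=True,
    )
    return ranked[:k]


# Made-up, already-discretized features; 1 = same or higher dividends.
labels = [1, 1, 0, 0, 1, 0]
columns = {
    "debt_to_equity_low": [1, 1, 0, 0, 1, 0],
    "cashflow_rising": [1, 0, 0, 0, 1, 1],
    "constant": [0, 0, 0, 0, 0, 0],
}
print(top_k_features(columns, labels, 2))
```

Top K ranks each feature on its own, so it can keep several near-duplicate features. That is one plausible reason a redundancy-aware filter would pair better with the classifiers.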
Classifiers: Again, I tried a few different possible ones: C4.5, Support Vector Machine, Naive Bayes, Random Forest (100), and Bagging with SVM (100). The combination of RBF with tree-based classifiers outperformed the rest, which I believe is due to the nature of RBF, where redundant features are removed. Hence, Random Forest (100) was chosen.
Model Score to Dividend Strength: Finally, it is not meaningful to simply give the raw model score for a particular stock, so I converted it into something I call Dividend Strength, which really is P("true positive" | "predicted positive") based on ten-fold cross-validation. This is why the "worst" possible Dividend Strength is "total positive count" / "total count", which is approximately 63% in my dataset. Note: positive here means giving the same or higher dividends the following year.
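One way such a conversion could work is sketched below. It assumes that Dividend Strength for a score s is the precision among cross-validated samples scoring at least s. This reading is consistent with the base-rate floor described above, but the exact mapping used in SGXcafe is not spelled out here, so treat the function and data as hypothetical.

```python
def dividend_strength(score, cv_scores, cv_labels):
    """Precision among cross-validated samples scoring at least `score`.

    cv_scores: out-of-fold model scores from cross-validation.
    cv_labels: matching labels (1 = same or higher dividends next year).
    """
    # Assumption: scores above anything seen in CV are clamped to the
    # highest observed score so that the estimate is always defined.
    threshold = min(score, max(cv_scores))
    selected = [y for s, y in zip(cv_scores, cv_labels) if s >= threshold]
    return sum(selected) / len(selected)


# Made-up cross-validation results with a base rate of 5/8 = 62.5%.
cv_scores = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8]
cv_labels = [0, 1, 0, 1, 1, 0, 1, 1]
print(dividend_strength(0.0, cv_scores, cv_labels))  # lowest threshold: base rate
print(dividend_strength(0.5, cv_scores, cv_labels))
```

At the lowest threshold every sample counts as "predicted positive", so the result falls back to the overall share of positives, which is the floor mentioned above.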
i) The ten classifiers built during the ten-fold cross-validation, each with a different set of features, are actually used to generate a mean model score for a new sample.
ii) While it would have been more ideal to keep a separate dataset to check the final model performance, given that I only had 1,641 data points, I decided to let the true test be based on 2016's dividend results.
iii) This model is updated weekly and hence the Dividend Strength of stocks would change only weekly.
iv) I will build something later that will track how well this model is performing.
v) The result of the ten-fold cross-validation for the selected approach is an AUC of only 0.65, which is low compared to other classification problems. I am not sure how much further I can go because I do not know what the state of the art is, as I did not really survey the literature. However, from another perspective, it is 30% (65 / 50) better than random in choosing better performing dividend stocks.
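Notes i) and v) can be illustrated with a short sketch. The fold models below are hypothetical stand-ins (plain functions returning a score, three instead of ten), not the actual Random Forests, and `lift_over_random` simply reproduces the 65 / 50 arithmetic.

```python
from statistics import mean


def ensemble_score(fold_models, sample):
    """Average the scores that each cross-validation fold model gives a sample."""
    return mean(model(sample) for model in fold_models)


def lift_over_random(auc, random_auc=0.5):
    """Ratio of the model's AUC to that of a random classifier."""
    return auc / random_auc


# Hypothetical stand-ins for the per-fold classifiers.
fold_models = [
    lambda s: 0.6 + 0.1 * s["cashflow_rising"],
    lambda s: 0.5 + 0.2 * s["cashflow_rising"],
    lambda s: 0.7,
]
sample = {"cashflow_rising": 1}
print(ensemble_score(fold_models, sample))
print(lift_over_random(0.65))
```

Averaging over the fold models means no single fold's feature set dominates the score for a new stock.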