A Recurrent Neural Network Framework for Predicting Asset Returns 


The investing profession is unlike most other professions. To achieve excellent performance, you need to act differently from the majority. If a group of proficient engineers collaborates on building a light bulb, they’ll probably come up with a well-designed light bulb. This is because there is a solution to this problem that doesn’t depend on other light bulb designs. A light bulb is not a complex adaptive system. In the markets, a similar collaboration would not lead to an excellent outcome. Whatever the majority perceives to be right gets priced into the market. It is therefore necessary that the actions of the excellent professional run contrary to the opinion of the majority. Of course, this is not yet a sufficient condition. Importantly, the conclusions giving rise to the actions also have to be better in some way. I have no doubt the reader realizes that the first element of this argument is contained in the second one, as ‘better’ necessarily implies ‘different’. I think it is useful nonetheless to state the condition explicitly so as not to set oneself up for failure.

In a strict sense, all the effort directed at reading books, blogs, news, analyses, and basically anything else written by other people would be wasted. And that is before asking why, if the authors really had an informational edge, they would sell it to you instead of acting on it themselves and gaining much more in the process. I think the argument in its strict form is not always true, though: value investing, for example, capitalizes on superior emotional stamina rather than on an informational advantage.

Instead of following well-known investing strategies, I’m focusing my efforts on developing systems that generate truly unique strategies. For that, it’s useful to state the fundamental principles that excellent investing strategies must abide by, and to reason upward from there.

  1. Diversification with uncorrelated (or even better negatively correlated) return streams
  2. Compounding
  3. Reducing the risk of catastrophic losses to practically zero
  4. Having an edge

The standard deviation of a portfolio’s return stream declines with the square root of the number of uncorrelated assets (n) within that equally weighted portfolio. It is critically important that these assets (or rather the income streams from trading these assets, which are not the same thing) are uncorrelated. Suppose you had a portfolio of 30 equally weighted assets whose income streams were 60% correlated with each other, and each stream had a Sharpe ratio (the expected excess rate of return divided by its standard deviation) of 0.2. With each stream’s standard deviation normalized to 1, the portfolio Sharpe ratio would be: Sharpe_c = n*0.2/(n + 2*n(n − 1)/2 *0.6)^0.5 = 0.255. Combining correlated assets only results in a small reduction in risk. Contrast this with the Sharpe ratio assuming uncorrelated assets: Sharpe_u = n*0.2/(n)^0.5 = n^0.5 * 0.2 = 1.095. Here the covariance part of the variance vanishes. The result is a much larger reduction in risk. Furthermore, asymptotically, as n gets large, the risk approaches zero in the uncorrelated case but not in the correlated case. To illustrate, if you included 100 assets in the portfolio instead of 30, you would get a Sharpe ratio of Sharpe_u = 100^0.5*0.2 = 10*0.2 = 2 in the scenario where the return streams are uncorrelated. Again assuming 60% pairwise correlation, the Sharpe ratio is a mere Sharpe_c = 0.257 for the n=100 portfolio, almost no improvement over the n=30 portfolio.
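
The arithmetic above can be reproduced with a few lines of Python. This is only a sketch under the same simplifying assumptions as the text: equal weights, identical Sharpe ratios, unit standard deviation for every stream, and one common pairwise correlation. The function name `portfolio_sharpe` is made up for this illustration.

```python
import math


def portfolio_sharpe(n: int, asset_sharpe: float, rho: float) -> float:
    """Sharpe ratio of an equally weighted portfolio of n return streams.

    Assumes every stream has the same Sharpe ratio and a standard deviation
    of 1, and that every pair of streams has the same correlation rho.
    """
    # Expected excess return of the sum of n streams (sigma = 1 each).
    expected_excess = n * asset_sharpe
    # Variance of the sum: n variance terms plus n*(n-1) covariance terms.
    variance = n + n * (n - 1) * rho
    return expected_excess / math.sqrt(variance)


for n in (30, 100):
    correlated = portfolio_sharpe(n, 0.2, 0.6)
    uncorrelated = portfolio_sharpe(n, 0.2, 0.0)
    print(f"n={n}: correlated={correlated:.3f}, uncorrelated={uncorrelated:.3f}")
```

Under these assumptions the correlated case can never exceed 0.2/0.6^0.5 ≈ 0.258, no matter how many assets are added, while the uncorrelated case keeps growing with the square root of n.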

As I, and this blog, started out with the value investing philosophy, I will briefly run this philosophy by the aforementioned principles. The core principles of value investing match points 2, 3, and 4; point 1, however, is at odds with it. Since most value investors are long-only stock pickers, diversification is a serious challenge for them. First, since a comprehensive fundamental analysis is necessary to evaluate whether or not the investor has an edge, he will soon reach the limit of his capacity. It is simply not feasible to evaluate thousands of opportunities to come up with a selected basket of, say, 100 stocks without compromising quality. But even if he could, it would not result in true diversification, as stocks are correlated and, as n gets large, the total variance is dominated by the covariance part. This is the reason why many value investors declare diversification beyond a certain threshold (typically between 5 and 50 stocks) useless and instead focus their funds on their most cherished ideas. If that’s not enough to convince you that value investing, as it is practiced by most individuals, has a problem with diversification, consider the fact that Asness et al. (2013) found that the value premium is correlated across asset classes. So even if a more sophisticated value investor were to consider a long/short value strategy across otherwise uncorrelated asset classes, the transformation his investment strategy applies to the asset return streams would yield correlated portfolio constituents. To be clear, I don’t discount value investing as a valuable part of a portfolio. It just can’t be an optimal strategy on its own.

Let’s return to the fundamental principles. The second principle is compounding. As Albert Einstein reportedly said: “Compound interest is the eighth wonder of the world. He who understands it, earns it … he who doesn’t … pays it.” Simply put, 10% compounded for 100 periods is not 100*10% = 1,000% but 1.1^100 − 1 ≈ 1,378,000%.
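
To check that figure, here is a minimal sketch comparing simple and compound growth. The function names are invented for this example.

```python
def simple_growth(rate: float, periods: int) -> float:
    """Total return if gains are never reinvested."""
    return rate * periods


def compound_growth(rate: float, periods: int) -> float:
    """Total return if every period's gain is reinvested."""
    return (1 + rate) ** periods - 1


print(f"simple:   {simple_growth(0.10, 100):,.0%}")
print(f"compound: {compound_growth(0.10, 100):,.0%}")  # roughly 1,378,000%
```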

The third principle can be summarized as ‘everything times zero is zero’. It doesn’t matter how stellar one’s track record was, if there is one devastating year, it was all for nought.

The fourth principle is to have an edge. What is meant by this is basically that one has some kind of advantage over one’s competitors such that the expected value of one’s efforts is positive. The easiest (but not the only) way to gain such an advantage is to carefully select one’s competitors. The harsh truth about trading the secondary markets is that you have to take the money from someone. It is arguably more probable to have an edge against less sophisticated investors than against sophisticated professionals. Luckily for the individual trader, there are some obvious ways to stay out of the way of the most sophisticated professionals. Historically, this blog focused on small, obscure stocks that provide opportunities that are not economical for the professional to exploit. Another way is to trade intraday, arguing that there is not enough liquidity in this timeframe for larger funds to employ similar strategies. I’m going to argue that the latter approach is more rewarding, as it also allows for more frequent trades and thus more opportunities for the magic of compounding to do its thing.

Where does this rumination lead us? We need a system that generates many uncorrelated trading strategies, ideally trading very often. These strategies are then bundled with proper risk management into a portfolio. To this end, I’ve developed a program that takes any data (price, fundamental, sentiment, satellite image data, etc.) and generates a trading strategy for a specified trading frequency. The core of this program is a recurrent neural network (RNN). More specifically, I use Long Short-Term Memory (LSTM) units. LSTMs are a specific type of RNN that provides a solution to the vanishing gradient problem, as shown by Hochreiter and Schmidhuber (1997). Basically, LSTM cells can look further into the past than traditional RNNs. LSTM networks are one important driver behind the recent advances in machine translation and speech recognition, to give only two examples. For more examples, read: http://karpathy.github.io/2015/05/21/rnn-effectiveness/.
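
For readers who have not seen an LSTM cell written out, here is a minimal pure-Python sketch of one forward step. It uses the common modern formulation with a forget gate, which is not necessarily the variant or the library used in the framework. The function `lstm_step`, the parameter layout, and the toy inputs are all assumptions made for this illustration.

```python
import math


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def lstm_step(x, h_prev, c_prev, params):
    """One forward step of an LSTM cell (illustrative sketch only).

    x, h_prev and c_prev are lists of floats. params maps each gate name
    ("f", "i", "o", "g") to a tuple (W, U, b): W has shape hidden x input,
    U has shape hidden x hidden, and b has length hidden.
    """
    def affine(gate):
        W, U, b = params[gate]
        return [
            sum(w * xi for w, xi in zip(W[k], x))
            + sum(u * hi for u, hi in zip(U[k], h_prev))
            + b[k]
            for k in range(len(b))
        ]

    f = [sigmoid(v) for v in affine("f")]    # forget gate
    i = [sigmoid(v) for v in affine("i")]    # input gate
    o = [sigmoid(v) for v in affine("o")]    # output gate
    g = [math.tanh(v) for v in affine("g")]  # candidate cell update

    # The cell state is updated additively; this path is what lets
    # information (and gradients) survive over many time steps.
    c = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c_prev, i, g)]
    h = [ok * math.tanh(ck) for ok, ck in zip(o, c)]
    return h, c


# Toy usage: 4 inputs (think open, high, low, close), 2 hidden units,
# all weights zero. The numbers are made up.
hidden, inputs = 2, 4
params = {
    gate: (
        [[0.0] * inputs for _ in range(hidden)],
        [[0.0] * hidden for _ in range(hidden)],
        [0.0] * hidden,
    )
    for gate in "fiog"
}
h, c = lstm_step([1.00, 1.10, 0.90, 1.05], [0.0] * hidden, [1.0] * hidden, params)
print(h, c)
```

With all weights at zero every gate sits at 0.5 and the candidate update is 0, so the step simply halves the previous cell state. A trained network learns weights that decide, per time step, how much of the past to keep.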

As a test, I gave the framework only OHLC and datetime information for the EURUSD forex pair. The model achieved a Sharpe ratio of >5 and a CAGR of 60%. I’m the first to point out that this is not an outcome you should expect to achieve in reality. But the model only gets price data, data that many believe has no predictive power whatsoever. For more information about this project, refer to my GitHub page: https://github.com/jpwoeltjen.


Ugly vs. Pretty Value Stocks

As Warren Buffett rightly says, value and growth are joined at the hip.[1] It seems like a perfectly sensible strategy to pay more for high-quality businesses than for low-quality deep value stocks. And it is… in theory. In practice, it is extraordinarily difficult to find the right trade-off. Some systematic research has tried to improve value strategies by including a quality component. Quality, here, means anything one should be willing to pay for (e.g., ROE, ROIC, growth, profitability). Some of these studies show very counter-intuitive results.

A prominent example of a strategy that supplements a pure value ranking with a quality measure is Joel Greenblatt’s Magic Formula. In their outstanding book “Quantitative Value”, Wesley Gray and Tobias Carlisle show that the quality component actually decreases performance relative to a portfolio based on a value ranking alone.[2] The likely reason for this is the mean-reverting nature of return on capital, the quality measure used. Economic theory predicts increased competition when companies demonstrate high returns on capital, and exits of competitors when returns are poor. New competition decreases returns for all suppliers; exits of competitors increase returns for the remaining businesses. Betting on businesses with historically high returns therefore seems like a bad idea on average.

As ambitious bargain hunters, we try to find high-quality businesses at low prices. Studies, however, show that (in competitive markets) valuation is far more important than quality. And in some cases, due to naïve extrapolation by noise traders, quality is actually associated with lower returns. Lakonishok, Shleifer and Vishny (1994) study exactly that. Their results fly in the face of many investors working hard to find the ‘best’ value stocks. Lakonishok et al. construct value portfolios based not only on current valuation ratios but also on past growth. They define the contrarian value portfolio as having a high Book-to-Market ratio (B/M, the inverse of P/B) and low past sales growth (GS). The reasoning is that, by Lakonishok et al.’s definition, value strategies exploit other investors’ failure to factor reversion to the mean into their forecasts. This is a form of base rate neglect, a tendency in intuitive decision-making found by Kahneman and Tversky (1982).[3] Lakonishok et al. thus identify stocks with low expected future growth (as implied by the valuation ratio) and low past growth (GS), a combination that indicates naïve extrapolation of poor performance. They show that this definition of value performs better than a simple definition based only on a valuation ratio (e.g., B/M). Another way of looking at this is to subdivide the high B/M stocks further into high and low past growth. The low past growth stocks outperform the high past growth stocks by more than 4 percentage points p.a. (21.2% vs. 16.8% p.a.) while the B/M ratios of these sub-portfolios “are not very different.”[4]
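
A two-way sort of this kind is easy to sketch. The following is not the procedure from the paper itself, only an illustration of the idea: the 30% cut-off, the function name `contrarian_value`, and all tickers and numbers are assumptions invented for the example.

```python
def contrarian_value(stocks, pct=30):
    """Tickers that are both high B/M and low past sales growth.

    stocks is a list of dicts with keys "ticker", "bm" and "gs".
    pct is the share (in percent) of the universe taken from each sort;
    the default of 30 is an illustrative assumption.
    """
    k = max(1, len(stocks) * pct // 100)
    # Independent sorts: one on valuation, one on past growth.
    by_bm = sorted(stocks, key=lambda s: s["bm"], reverse=True)
    by_gs = sorted(stocks, key=lambda s: s["gs"])
    high_bm = {s["ticker"] for s in by_bm[:k]}
    low_gs = {s["ticker"] for s in by_gs[:k]}
    return sorted(high_bm & low_gs)


# Hypothetical universe, made up for this example.
universe = [
    {"ticker": "A", "bm": 2.0, "gs": -0.05},
    {"ticker": "B", "bm": 1.8, "gs": 0.20},
    {"ticker": "C", "bm": 1.5, "gs": -0.10},
    {"ticker": "D", "bm": 1.0, "gs": 0.02},
    {"ticker": "E", "bm": 0.9, "gs": 0.30},
    {"ticker": "F", "bm": 0.8, "gs": 0.05},
    {"ticker": "G", "bm": 0.6, "gs": 0.00},
    {"ticker": "H", "bm": 0.5, "gs": 0.15},
    {"ticker": "I", "bm": 0.4, "gs": 0.25},
    {"ticker": "J", "bm": 0.3, "gs": 0.40},
]
print(contrarian_value(universe))
```

In this toy universe, stock B is cheap on B/M but has grown quickly, so it falls out of the contrarian value portfolio even though a plain value screen would keep it.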

In his excellent book “Deep Value”, Tobias Carlisle shows insightful statistics for these portfolios. The incredible insight is that even if valuation ratios are practically the same, stocks that rank low on quality (past sales growth) perform better than high-quality stocks. One likely reason is mean reversion in fundamentals.

[Figure: contrarian investment]

Source: Carlisle, Tobias: Deep Value: Why Activist Investors and Other Contrarians Battle for Control of “Losing” Corporations, 2014, John Wiley & Sons, Inc., Hoboken, New Jersey, pp. 131-132.

Similar results also show up in the deepest of value strategies: net-nets. Oppenheimer (1986) shows that loss-making net-nets outperformed profitable net-nets (36.2% p.a. vs. 33.1%), and non-dividend-paying net-nets outperformed dividend-paying net-nets (40.6% vs. 27.0%) from 1970 to 1983. Carlisle confirms these results out of sample from 1983 to 2010.[5] My own backtests confirm these results from 1999 to 2015. My results, at least, are mainly driven by the higher discount: profitable businesses don’t usually trade at large discounts to net current asset value (NCAV).

Whether you are a full quant or not, if you are trying to pick the ‘best’ stocks from a value screen, you are likely making a systematic mistake, unless you are searching for businesses with moats (i.e., a sustainable competitive advantage that prevents a high return on capital from reverting to the mean). But good luck consistently finding such a business in deep value territory.

Regression to the mean is such a strong tendency, and is so systematically underestimated by market participants, that just betting on historically poorly performing businesses outperforms the market.[6] Bannister (2013) finds that betting on “unexcellent” companies (ranking low on growth, return on capital, and profitability) outperformed the market from 1972 to 2013 (13.74% p.a. vs. 10.59%). A portfolio constructed of stocks of “excellent” businesses, in turn, underperformed the market (9.77%).[7]

I still think good quality measures (i.e., measures that do not implicitly bet against regression to the mean in fundamentals) are a potent tool for improving a value ranking. It is, however, not as easy as blindly layering a quality screen over a value screen and thereby implying equal weights. A category of quality measures that is of special interest to me is distress/bankruptcy prediction. But even if such a measure is very good at identifying value traps, there is still the very serious issue of false positives, that is, excluding stocks that actually perform well on average. An overly sensitive measure will likely exclude all the ugliest stocks, which perform the best. More research is needed to determine a sensible weighting mechanism. The merit of such a measure depends on the false negative rate, the false positive rate, the cost of false negatives, and, importantly, the cost of false positives. The cost of false positives may be very high for concentrated portfolios. Even if the quality measure improves performance in studies, that doesn’t mean it will improve a concentrated value portfolio (20-30 stocks). The reason is that these studies often hold a very diversified portfolio (e.g., a decile), which is quite a number of stocks. If the quality factor excludes 20 extremely cheap stocks from such a portfolio, it’s not a big deal. If you were to hold the 30 cheapest stocks in the universe, however, and the quality factor excluded 20 of them and the next cheapest stocks had 2 times the valuation ratio, it is very likely that performance would suffer, because the value factor would be diluted too much. The important thing is to actually backtest your portfolio and not just rely on studies.
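
The dilution effect can be made concrete with a toy calculation. Everything here is hypothetical: a universe of 3,000 stocks in which the 30 cheapest trade at a valuation ratio of 3 and all others at 6, and a quality screen that rejects 20 of the 30 cheapest. The helper `mean_ratio` is invented for this sketch.

```python
def mean_ratio(universe, excluded, size):
    """Mean valuation ratio of the `size` cheapest stocks that pass the screen.

    excluded holds positions in the cheapest-first ordering of the universe.
    """
    ranked = sorted(universe)
    kept = [r for idx, r in enumerate(ranked) if idx not in excluded]
    held = kept[:size]
    return sum(held) / len(held)


# Hypothetical universe: 30 very cheap stocks, the rest twice as expensive.
universe = [3.0] * 30 + [6.0] * 2970
excluded = set(range(20))  # the screen rejects 20 of the 30 cheapest stocks

for size in (30, 300):  # concentrated portfolio vs. a full decile
    before = mean_ratio(universe, set(), size)
    after = mean_ratio(universe, excluded, size)
    print(f"{size} stocks: mean ratio {before:.1f} -> {after:.1f}")
```

In this made-up setting the concentrated portfolio’s average valuation ratio jumps from 3.0 to 5.0, while the decile portfolio barely moves (5.7 to 5.9). The same screen that looks harmless in a decile-based study changes the character of the 30-stock portfolio.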

Another interesting area of research lies in identifying moats that prevent mean reversion of high-return businesses. That, however, still leaves open the question of whether these businesses are systematically undervalued.


[1] Buffett, Warren: Letter to the Shareholders of Berkshire Hathaway Inc., 1992.

[2] Gray, Wesley and Carlisle, Tobias: Quantitative Value: A Practitioner’s Guide to Automating Intelligent Investment and Eliminating Behavioral Errors, 2013, John Wiley & Sons, Inc., Hoboken, New Jersey, Table 11.1.

[3] Kahneman, Daniel and Tversky, Amos: Intuitive Prediction: Biases and Corrective Procedures, in D. Kahneman, P. Slovic, and A. Tversky, Eds.; Judgment Under Uncertainty: Heuristics and Biases, 1982, Cambridge University Press, Cambridge, England.

[4] Lakonishok, Josef, Shleifer, Andrei and Vishny, Robert: Contrarian Investment, Extrapolation, and Risk, The Journal of Finance 49, no. 5, 1994, p. 1555. http://www.jstor.org/stable/2329262 OR http://lsvasset.com/pdf/research-papers/Contrarian-Investment-Extrapolation-and-Risk.pdf

[5] Carlisle, Tobias: Deep Value: Why Activist Investors and Other Contrarians Battle for Control of “Losing” Corporations, 2014, John Wiley & Sons, Inc., Hoboken, New Jersey, p. 133.

[6] Lakonishok, Josef, Shleifer, Andrei and Vishny, Robert: Contrarian Investment, Extrapolation, and Risk, The Journal of Finance 49, no. 5, 1994, p. 1549. http://www.jstor.org/stable/2329262 OR http://lsvasset.com/pdf/research-papers/Contrarian-Investment-Extrapolation-and-Risk.pdf

[7] Bannister, Barry, Stifel Financial Corp., and Eyquem Investment Management LLC from Carlisle, Tobias: Deep Value: Why Activist Investors and Other Contrarians Battle for Control of “Losing” Corporations, 2014, John Wiley & Sons, Inc., Hoboken, New Jersey, p. 137.

Limits of Arbitrage

Many value investors acknowledge that there are many other smart traders, but believe these other traders somehow don’t understand value investing. It appears that a lot of value investors are hugely overconfident when it comes to their special insight, i.e., that value investing works and others just don’t get it. Yet there is overwhelming evidence that value investing does work and continues to work even after a lot has been written about it. So why does the value premium persist? Fortunately, there are better explanations than ignorance. Behavioral finance tries to explain the outperformance of value strategies by differentiating between noise traders, arbitrageurs, and the arbitrageurs’ clients. On the one hand, there has to be someone who, probably due to some bias (e.g., extending the recent negative earnings trend too far into the future and thereby ignoring regression to the mean), sells an asset at a price below fundamental value: the noise trader. On the other hand, there has to be some reason why professional traders with vast resources do not arbitrage this price/value gap away immediately. This is crucial but often ignored. Shleifer and Vishny (1997) explore a possible reason why mispricings may occur even if specialized arbitrageurs are knowledgeable and rational.[1] They do this by assuming that the arbitrageur and the owner of the invested money are two separate entities. According to their model, the arbitrageur’s clients update their prior beliefs about the arbitrageur’s competence by incorporating the recent performance of investments into their assessment. Understanding the limits of arbitrage can help us separate undervalued assets from superficially cheap but not actually underpriced assets.

The key insights are:

  • In academia, arbitrage is typically defined as riskless and requiring no capital. In practice, however, it does require capital (usually part of it from outside investors) and is associated with several forms of risk.
  • Arbitrage is typically performed by specialized traders who are not well diversified.
  • Especially in value situations, assets can decline further in price in the short run, even if they are good bets long-term.
  • Clients do not have perfect knowledge of the arbitrageur’s competence. It can thus be a rational choice to withdraw capital from an underperforming manager. This forces the manager to sell off assets, even though the expected return actually increased after the price drop.
  • An agency problem breaks down the link between greater mispricing and higher expected return from the client’s perspective.
  • This can result in irrational prices while the arbitrageurs and their clients themselves act rationally.

In which situations is arbitrage most limited then?

  • First and foremost: Small Size. The absolute dollar amount that can be earned arbitraging in extremely small situations is just too small to make the return on invested resources attractive for professional fund managers. This just leaves individual investors. But in the smallest situations, even these investors have to be either inexperienced or so far unsuccessful, or they would have gathered enough capital to make it uneconomical for them as well. Sounds like weak competition to me!
  • For professional fund managers, very volatile markets increase the risk of looking incompetent in the short term. Therefore, all else equal, we should expect less arbitrage in volatile markets.
  • The risk of further price declines is greater in situations that take a longer time to play out and are unpredictable. Hence, we should see more arbitrage in strategies that play out (at least partially) before clients can withdraw capital and less when there may be many months or even years of underperformance before the manager is eventually proven right.
  • Building on that point, this effect should be more severe for assets where there is clearly something wrong, as with the typical deep value stock. On average, betting on value stocks can be a good idea, but you can look extremely incompetent on any given investment. Hindsight bias compounds this issue. Value situations that do not play out look like stupid investments in hindsight. It might thus be a good idea, as an individual investor, to explicitly focus on situations where one can look extremely incompetent or negligent on any individual investment to an ignorant outsider (incompetent or self-serving corporate insiders, industry downturn, loss of a major customer, negative earnings trend, regulatory issues, etc.)

Of course, identifying areas where mispricings are likely is not in itself a viable investment approach. Assets can be overvalued as well as undervalued. But combined with a value ranking, e.g. EV/EBIT, searching in less efficient markets can reduce the risk of buying statistically cheap stocks whose prices are actually justified. This is a completely different approach to excluding value traps than the typical qualitative assessment.
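
As a rough sketch of how the two ideas could be combined, the snippet below first restricts the universe to small companies, where arbitrage is plausibly most limited, and then ranks what is left by EV/EBIT. The market-cap threshold, the function name, and the sample data are all hypothetical. Dropping companies with non-positive EBIT is a simplification that keeps the ratio meaningful, not a recommendation.

```python
def cheap_in_neglected_markets(stocks, max_market_cap, top_n):
    """Cheapest stocks by EV/EBIT among companies below a size threshold.

    stocks is a list of dicts with keys "ticker", "market_cap", "ev", "ebit".
    """
    candidates = [
        s for s in stocks
        if s["market_cap"] <= max_market_cap and s["ebit"] > 0
    ]
    # Lower EV/EBIT means statistically cheaper.
    candidates.sort(key=lambda s: s["ev"] / s["ebit"])
    return [s["ticker"] for s in candidates[:top_n]]


# Made-up figures in millions, for illustration only.
stocks = [
    {"ticker": "AAA", "market_cap": 20, "ev": 30, "ebit": 10},
    {"ticker": "BBB", "market_cap": 5000, "ev": 4000, "ebit": 2000},
    {"ticker": "CCC", "market_cap": 15, "ev": 10, "ebit": -2},
    {"ticker": "DDD", "market_cap": 40, "ev": 50, "ebit": 10},
    {"ticker": "EEE", "market_cap": 8, "ev": 6, "ebit": 3},
]
print(cheap_in_neglected_markets(stocks, max_market_cap=50, top_n=2))
```

Here BBB is just as cheap as EEE on EV/EBIT, but it is large enough that professional arbitrageurs can trade it, so the size filter removes it before the value ranking is applied.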

[1] Shleifer, Andrei, and Vishny, Robert W. “The Limits of Arbitrage.” The Journal of Finance 52, no. 1 (1997): 35-55. http://links.jstor.org/sici?sici=0022-1082%28199703%2952%3A1%3C35%3ATLOA%3E2.0.CO%3B2-3