Jan. 12, 2021

The following post is in reference to a question asked on overfitting in a QuantConnect forum.

Any stock trading strategy designer should have views on this subject, since overfitting gets in the way of, if it is not at the heart of, any such strategy, whether live or simulated. I find overfitting indirectly related to the law of diminishing returns, meaning that, going forward, your trading strategy will produce less over time. However, it can also be viewed in light of another problem: thinking that the market will strictly follow our often misconceived and poorly designed trading strategies. It should be forcefully noted that the market has no such obligation.

As a portfolio grows, it becomes harder and harder to just maintain its compounding rate of return, just as it gets harder, under uncertainty, to handle the increasing trading volume needed to generate those profits. Also, just as a very basic consideration, the set of selected parameters for a simulation might not even prevail going forward.

We are all aware of these notions. Nonetheless, I do state that basic common sense should also prevail. Start with what is important and has passed the test of time. Then build on those notions.

I agree with the findings of de Prado and his apprehensions about overfitting. This might sound contradictory, but I have read a few hundred of those papers and books to that effect. There are some holes in the whole vision that is projected, but I am not about to write a thesis on that.

That literature leans toward the view that, eventually, all trading strategies fail. That is not something I agree with. If all trading strategies fail, then why would you ever follow one? Why would you even consider making a simulation of a trading strategy over past market data? The answer is simple, as Jim Simons once said: “It is the best we have to help us determine what is coming next”.

US Stocks

Every share of every US stock is in the hands of someone, all the time. They can change hands in under a millisecond. Some of those shares have been in some hands for some time (several decades), and what we can observe without even a backtest is that stocks have had, on average, a long-term upward trend. The simple method of just holding a diversified portfolio for a generation or two would have given a positive outcome where all you had to do was sit on your hands.

A good example of a strategy that did not fail is Berkshire Hathaway. It has been there and prospered over the ups and downs of every economic cycle over the last 50+ years under the mantra: “...do not bet against America”. I make the same bet. Therefore, over the long term, I also expect the market, in general, to go up, not down.

I find that premise important as it should guide whatever trading system I might design. If your premise is that, eventually, you will have, on average, higher prices, your betting system should favor long positions. And if you think the market will fail, then short the thing.

Some say that you need predictive powers to outperform. I would dare to advance that it is not even necessary or required. All that is required is to participate in the game, that is: at least, be long when prices are rising and do not be there when not. And so, switching to bonds or equivalents on market downturns appears as a reasonable proposition.

Market Trend

What you really need is “something” that will declare the trend as up or down. And whatever it is, you should have your trading strategy follow suit. Meaning you take longs when “you” declare the market (or the group of stocks you intend to trade) as going up.

What you use to declare the trend is therefore consequential. That is why we do backtests: to find out if our vision of things would at least have prevailed over past market data.

It is not enough to design a trading strategy. You also have to go a step further by gaming it.

Portfolio Simulations

Here is where I see a problem. Whatever simulation we do, we are dealing with a one-time occurrence. Any of the stocks we might touch in our trading strategies have only one history. They are all one of a kind. And there appears at times to be no statistical significance, except that we are always at the right edge of the chart. For example: Lehman Brothers took 140 years to reach its historical high and only 9 months thereafter to go bankrupt. Whatever statistics we might have had on Lehman up to the year before its demise were completely useless and, even worse, quite detrimental to any portfolio. Because of scenarios like this, and there are a lot of them, we, as developers, should exercise extreme caution. The data we are fed needs to be validated, or we should play 100 stocks or more at a time in order to reduce the size of our betting functions and, therefore, their impact on our overall portfolio.
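The arithmetic behind playing 100 stocks or more can be sketched in a few lines. This is only an illustration under simplifying assumptions that are not in the original text: positions are equal-weighted, and the failing stock goes all the way to zero. The function name is hypothetical.

```python
def loss_from_single_failure(n_stocks: int) -> float:
    """Fraction of an equal-weighted portfolio lost if ONE holding goes to zero.

    Assumption (illustration only): every position has the same weight, 1/n.
    """
    if n_stocks <= 0:
        raise ValueError("n_stocks must be a positive integer")
    return 1.0 / n_stocks


if __name__ == "__main__":
    for n in (1, 10, 100):
        hit = loss_from_single_failure(n)
        print(f"{n:>3} stocks: one bankruptcy costs {hit:.1%} of the portfolio")
```

With 100 equal positions, a Lehman-type event on one of them removes about 1% of the portfolio instead of all of it.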

Short-term trading is close to simply gambling. It is investing but of a particular kind. We win or lose based on what we selected for the duration of our holding period, whether long or short. And that puts us right back to our bet on America: a positively biased probabilistic trading environment.

My point is: can you game a system better? How far and by how much? Is trading QQQ a real market risk since it is a market proxy? Trading on QQQ is equivalent to trading the 100 stocks it is composed of. Can you slice QQQ into sub-intervals where you take long positions only when you declare its trend as up? And this, whether it be related to the market or not.

Market Trend Proxy

The most basic of questions: what will you use to declare bets on and bets off?

Of the different versions of In & Out, I prefer this one. It is more stable and more risk-averse than the others. I do not mind the drawdowns so much since I consider them inevitable whatever I do. On the other hand, I seek volatility while in a position in order to extract a larger profit.

Version 1.1 had an average win of 6.92% and an average loss of -0.78%. This could be interpreted as: the average stop-loss was -0.78%, while the average stop-profit was 6.92%. It shows how sensitive to market downturns the strategy was. It did not tolerate, on average, a 1% decline in value on its average position.

You combine this with a win rate of 78%, and I have to declare that there was alpha built-in!
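To see why those three numbers point to an edge, here is a rough sketch of the expected return per trade. It assumes, for illustration only, that every trade has the same position size and that the averages can simply be weighted by the win rate; it ignores compounding, fees, and slippage. The function names are hypothetical.

```python
def expectancy(win_rate: float, avg_win_pct: float, avg_loss_pct: float) -> float:
    """Average outcome per trade, in percent of the position.

    avg_loss_pct is expected to be negative (e.g. -0.78).
    """
    return win_rate * avg_win_pct + (1.0 - win_rate) * avg_loss_pct


def breakeven_win_rate(avg_win_pct: float, avg_loss_pct: float) -> float:
    """Win rate at which the expectancy is exactly zero."""
    loss = abs(avg_loss_pct)
    return loss / (avg_win_pct + loss)


if __name__ == "__main__":
    # Figures quoted in the text for version 1.1.
    e = expectancy(0.78, 6.92, -0.78)
    b = breakeven_win_rate(6.92, -0.78)
    print(f"expectancy per trade: about {e:.1f}%")
    print(f"break-even win rate:  about {b:.1%}")
```

Under these assumptions, the average trade comes out at roughly 5.2%, and the strategy would only need to win about one trade in ten to break even, against a reported win rate of 78%.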

So, as said before, nice work. The strategy did not need to be predictive. At its core, it is a simple and sensitive trend-following system that coincidentally could be viewed as if predictive. It expects the next day to be in the same direction it just declared. Or it assumes that today will be in the same direction as yesterday.
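The "today will be in the same direction as yesterday" idea can be illustrated with a toy rule. This is not the In & Out signal, whose actual logic is not reproduced here. It is a minimal sketch with hypothetical function names and made-up prices: stay long for the next day if the latest close is above the previous one, otherwise stay out.

```python
def persistence_signal(prices):
    """signals[t] = 1 if prices[t] > prices[t-1], else 0.

    signals[t] is only known at the close of day t, so it is applied
    to the move from day t to day t+1 (no look-ahead).
    """
    signals = [0]  # no prior day to compare with on day 0
    for prev, cur in zip(prices, prices[1:]):
        signals.append(1 if cur > prev else 0)
    return signals


def strategy_growth(prices):
    """Growth of 1 unit of capital when long only on days flagged the day before."""
    signals = persistence_signal(prices)
    growth = 1.0
    for t in range(1, len(prices)):
        if signals[t - 1] == 1:
            growth *= prices[t] / prices[t - 1]
    return growth


if __name__ == "__main__":
    closes = [100, 101, 103, 102, 104, 107]  # made-up closing prices
    print(persistence_signal(closes))
    print(round(strategy_growth(closes), 4))
```

The rule forecasts nothing; it simply follows the last declared direction and accepts a small loss whenever that direction reverses.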

What I did is amplify the gaming side of the equation. If you take the case with the 1.4x leverage, the average stop-loss was -1.69%, and the average stop-profit was 15.18%, with a win rate of 77%. I am not surprised that it generated a 506,659% total return on its 100k stake. You could do even better by pushing for more on the governing equations. And that was also illustrated. The last chart presented had a 2,000,000+% total return. This is more than enough to invite back the notion of overfitting.

We have nothing where we can say: this strategy is “fitted”; it is a perfect fit. But some can declare it is overfitted. Compared to what? Where is the rationale? That other strategies failed? That the future will be different from the past? But we already know that as almost self-evident. A trading strategy will behave differently going forward? Absolutely. My modifications to this trading strategy showed that you could push version 1.1 to go from 8,518% to over 2,000,000% total return and anything in between. At what point do you declare it “overfitted”? If it was overfitted at 100,000%, what was it at 500,000%? Do we have a measure of overfittedness?

The range provided here is mind-blowing. You can change the perspective on this strategy, change its objectives, increase and decrease parameters, and it can generate quite a wide range of outcomes. All stuff we have to think about.

Nonetheless, I would classify my highest total return scenarios in the overfitted category. By how much, unfortunately, I do not know. What I do know is that I was able to push the strategy toward its tradable limits without reaching them or having the strategy blow up. The simulations I made on the variations of version 1.1 allowed those outcomes. The charts themselves are a demonstration of that. Furthermore, anyone running exactly the same trading scripts using QuantConnect software and their data would have generated exactly the same charts. That is the beauty of computers running programs: they return 2+2=4 every time, for everyone.

I do think that any stock trading strategy simulated over an extended period of time showing a 40%+ CAGR should gain some interest. And even more so if it can exceed a 70% CAGR.
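For readers who want to relate the total returns quoted earlier to a CAGR, the conversion is a one-liner. The length of the simulation is not stated in this post, so the 13-year figure below is purely an assumption for illustration (roughly 2008 to early 2021); change it to the actual backtest length. The function name is hypothetical.

```python
def cagr(total_return_pct: float, years: float) -> float:
    """Compound annual growth rate implied by a total return over `years`.

    total_return_pct: e.g. 8518 for +8,518% (ending value = 86.18x the stake).
    """
    if years <= 0:
        raise ValueError("years must be positive")
    growth_multiple = 1.0 + total_return_pct / 100.0
    return growth_multiple ** (1.0 / years) - 1.0


if __name__ == "__main__":
    ASSUMED_YEARS = 13.0  # assumption: the backtest period is not given in the text
    for total in (8_518, 506_659, 2_000_000):
        rate = cagr(total, ASSUMED_YEARS)
        print(f"{total:>9,}% total return -> CAGR of about {rate:.1%}")
```

Because of compounding, very large differences in total return translate into much smaller differences in CAGR, which is one reason total-return figures alone say little about how "overfitted" a scenario might be.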

The real question should be: Would you have chosen those stocks, trading procedures, and parameters in 2008?

Jan. 12, 2021, © Guy R. Fleury. All rights reserved.