June 5, 2013
Last week, while looking for some software routines, I started running some of my old programs to see whether watching them execute would make the routines easier to locate. These programs had not been executed for over two years, and out of curiosity I also wondered how they would have done. It would be like a walk-forward test and an out-of-sample test at the same time, since none of the data outside their initial testing intervals could possibly have been known to these programs.
In my search, I started at the top of the list with the ADD3 series. I have 17 versions in the ADD3 series, each incrementally better than its predecessor, except for version 17, which has a bug that I have neither the need nor the time to fix. ADD3 is the rare program for which I have that many versions. I usually keep two at best: my latest working version and my prior best. Most often I keep only my latest, which incidentally meant that I might have to go through over 200 trading scripts to find what I wanted.
Finding a software routine is not the same as finding a chapter in a book; not only do you need to understand the code, but also its interaction with the other modules in your program. Otherwise, how would you know that what you are looking at is what you are looking for? Two years after coding something, when you have written so many programs, it is hard to remember why a given piece of code is the way it is. Nonetheless, if I want to find these routines, I'll have to go through the programs.
What the executed ADD3 tests showed was that the search for my software routines would indeed take some time. After two years, I could hardly remember what the program was doing, which procedures and routines it employed and, most importantly, how all these trading decisions related to one another and what their combined impact was. Sure, the program, if run, would say whether it stayed profitable and to what extent. But it would not tell me whether it contained the routines I was looking for. I would have to dig even deeper into these programs to find them.
Nevertheless, the tests showed some interesting characteristics. First, the trading scripts had been designed to operate over a 6-year trading interval: from Q1 2005 to Q1 2011. In doing these new tests, I opted for a 25-year period, of which 19 years of data would not have been seen by the program. It would be testing unseen data from both ends of the trading interval at the same time. It would say how the program would have behaved before and after its original 6-year testing period.
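The split described above (unseen data before the original interval, the original interval itself, and unseen data after it) can be sketched in a few lines of Python. This is only an illustration, not the code used for the tests: the exact boundary dates and the names `segment` and `split_by_segment` are assumptions, since the text only gives "Q1 2005 to Q1 2011".

```python
from datetime import date

# Assumed boundaries: the text only says "Q1 2005 to Q1 2011".
IN_SAMPLE_START = date(2005, 1, 1)   # assumption: first day of Q1 2005
IN_SAMPLE_END = date(2011, 3, 31)    # assumption: last day of Q1 2011


def segment(day: date) -> str:
    """Label a date relative to the original 6-year testing interval."""
    if day < IN_SAMPLE_START:
        return "before"       # unseen data preceding the original tests
    if day <= IN_SAMPLE_END:
        return "in-sample"    # data the program was originally tested on
    return "after"            # unseen data following the original tests


def split_by_segment(days):
    """Group an iterable of dates into the three segments."""
    groups = {"before": [], "in-sample": [], "after": []}
    for day in days:
        groups[segment(day)].append(day)
    return groups
```

Performance on the "before" and "after" groups is what indicates how the program behaves on data it could not have been fitted to.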
After reaching ADD3 version #10, I opted to display the test results on a LinkedIn forum where I sometimes participate. The test had rather impressive metrics:
Summary Performance Results: IBM (version #10 over 25 years)
The above performance summary is for IBM from 1988 to the present (25 years), a sufficiently long period to show the program's merits. It generated 4,733 trades over the period, a large enough number to make the results relevant. The win rate was 85.1%, with an average holding period of about 1,340 trading days. The average loss on the 705 losing trades (14.9%) was about $1,107, representing, on average, a -13.3% decline per losing trade. All are quite interesting numbers.
I would like to highlight two other metrics of that test:
Profit factor: 32.27
Payoff ratio: 11.01
Both these numbers are considered impossible by most, and yet not only did this old program not break down over the added 2 years, it would have performed quite well over the past 25 years. And this, even though the program was originally tested on only 6 years of data: from Q1 2005 to Q1 2011. It showed itself to be profitable both before and after its designed and tested trading interval.
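For readers unfamiliar with the two metrics, here is a minimal sketch of how they are commonly defined on a list of per-trade dollar results: the profit factor as gross profit divided by gross loss, and the payoff ratio as average win divided by average loss. The function names and the sample trades are hypothetical, and the testing software that produced the figures above may use a different convention, so this sketch is not expected to reproduce 32.27 or 11.01.

```python
def profit_factor(pnl):
    """Gross profit divided by gross loss (common dollar-based definition)."""
    gross_profit = sum(p for p in pnl if p > 0)
    gross_loss = -sum(p for p in pnl if p < 0)
    if gross_loss == 0:
        raise ValueError("profit factor is undefined without losing trades")
    return gross_profit / gross_loss


def payoff_ratio(pnl):
    """Average winning trade divided by average losing trade."""
    wins = [p for p in pnl if p > 0]
    losses = [-p for p in pnl if p < 0]
    if not wins or not losses:
        raise ValueError("payoff ratio needs at least one win and one loss")
    return (sum(wins) / len(wins)) / (sum(losses) / len(losses))


# Hypothetical trades for illustration only, not taken from the tests.
sample_pnl = [300.0, 200.0, 100.0, -100.0]
print(profit_factor(sample_pnl))  # 6.0
print(payoff_ratio(sample_pnl))   # 2.0
```

Under these definitions, a high win rate combined with small average losses pushes both numbers up at the same time.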
Here I had this old program, doing what it was designed to do: survive what is coming next. It did so remarkably well, with impressive metrics. That left me with the later ADD3 versions. If they were also designed to survive and were "improvements" over their predecessors, then they, too, should outperform prior versions. There is only one way to show this, and that is to do the test: run version #16 on the same 25-year trading interval:
Summary Performance Results: IBM (version #16)
Since these "improvements" were designed to outperform versions #10 to #15, it was no surprise to see better metrics:
Number of trades went from 4,733 to 6,977.
Losing trades went from 705 to 565.
The win ratio went from 85.1% to 91.9%.
The loss ratio went from 14.9% to 8.1%.
The average loss on losing trades went from -$1,107 to -$884.
The profit factor went from 32.27 to 111.94.
The payoff ratio went from 11.01 to 12.83.
All this means that each improvement in previous iterations of the program incrementally helped increase overall performance, not only on its original 1,500-trading-day simulation but also on its new 6,223-day test (25 years). Not bad... But still, I have not found the routines I was looking for.
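As a quick consistency check, the win and loss percentages quoted for both versions follow directly from the trade counts. The short Python sketch below redoes that arithmetic; the helper name `win_loss_pct` is hypothetical, and it assumes every trade that is not counted as a loser is counted as a winner.

```python
def win_loss_pct(total_trades, losing_trades):
    """Return (win %, loss %) rounded to one decimal place.

    Assumption: every trade not counted as a loser is a winner.
    """
    loss_pct = 100.0 * losing_trades / total_trades
    return round(100.0 - loss_pct, 1), round(loss_pct, 1)


print(win_loss_pct(4733, 705))  # version #10 -> (85.1, 14.9)
print(win_loss_pct(6977, 565))  # version #16 -> (91.9, 8.1)
```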
My conclusion following these two tests would be to challenge some old market folklore and clichés, and to answer:
Yes, programs can perform for a long time.
Yes, they can have high profit factors.
Yes, they can have high payoff ratios.
Yes, they can have high win rates.
Yes, they can be highly profitable.
And yes, it can be done.
I would add that it can be done by anyone.
Added June 5
I did a portfolio test on the ADD3 script using version #17 back in July 2011 (most probably before it developed its bug, which would indicate that the problem is minor and easy to fix). So that program was last executed a little less than 2 years ago, not over 2 years ago as stated above. The simulation results can be found under the simulation menu.
IBM was selected for the present tests because, in 1988, it could have been a reasonable investment choice. Looking at the July 2011 test results, one would expect the other members of that portfolio to see their respective performance levels increase as well. This technically saves me from having to run such a test again.
Created... June 5, 2013, © Guy R. Fleury. All rights reserved.