## Overview

Many people have tried to apply machine learning, in one form or another, to forecasting trends and trading securities. John Fawcett and Alex Izydorczyk, among others, have discussed the topic in the Quantopian community. Here we'll walk through a simple classification strategy, comparing results across different classifiers, and use Quantopian's Zipline backtester to help prototype the strategy.

## Feature Generation

For any security \(s_k\), let the vector

$$\mathbf{x}_k(t) = [x(t-m), \dots, x(t-1)]$$

be the binary direction of movement of the price, where \(x(t) = 1\) represents an upward movement in the price from day \(t-1\), and \(x(t) = -1\) represents a downward movement. Let the target

$$y_k(t) = x(t)$$

be the movement in the price on day \(t\). We set \(m\) equal to the number of days we want to incorporate into each training example; for the purposes of this example we set \(m = 30\).
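As a minimal sketch (the function name and the flat-day convention here are my own assumptions, not from the original post), the direction vector for a single security could be built from a daily price series like this:

```python
import numpy as np

def direction_vector(prices, m=30):
    """Binary direction of the last m daily price moves.

    prices: 1-D array of daily closes, oldest first (needs m+1 values).
    Returns an array of +1 (up) / -1 (down) of length m.
    """
    prices = np.asarray(prices, dtype=float)
    moves = np.sign(np.diff(prices[-(m + 1):]))  # +1, 0, or -1 per day
    # Treat flat days as downward so the features stay binary
    # (a modeling choice; the post does not specify this case).
    moves[moves == 0] = -1
    return moves.astype(int)

# Example: 31 closes give a 30-element direction vector.
closes = 100 + np.cumsum(np.random.default_rng(0).normal(0, 1, 31))
x = direction_vector(closes, m=30)
print(x.shape)  # (30,)
```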

Let

$$G(t) = [F(t-n)', F(t-n+1)', \dots, F(t)']'$$

where

$$F(t) = [x_1(t), x_2(t), \dots, x_k(t)]'$$

is the set of features across all training examples, and \(x_1, x_2, \dots, x_k\) represent the directions for separate securities.

To recap, we now have a matrix \(G\) which contains \(n\) separate \(m\)-day recordings of directionality changes for each of \(k\) securities.

For this study we have set \(n = 60\), implying that we use a window of \(60\) days for the training and cross-validation sets.
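Under the definitions above, building the training data comes down to slicing rolling windows out of the direction series. A sketch (function names hypothetical; this is one way to realize the \(G\) matrix and the per-security supervised pairs, not the post's exact code):

```python
import numpy as np

def feature_matrix(directions, n=60):
    """The matrix G: the last n daily rows F(t-n+1), ..., F(t).

    directions: (T, k) array of +/-1 daily moves for k securities.
    """
    directions = np.asarray(directions)
    if directions.shape[0] < n:
        raise ValueError("need at least n days of history")
    return directions[-n:]

def windows(x, m=30):
    """Turn one security's +/-1 series into supervised (X, y) pairs:
    each row of X is m lagged directions, y is the next day's move."""
    x = np.asarray(x)
    X = np.array([x[i:i + m] for i in range(len(x) - m)])
    y = x[m:]
    return X, y

rng = np.random.default_rng(0)
dirs = rng.choice([-1, 1], size=(120, 3))  # 120 days, 3 securities
G = feature_matrix(dirs, n=60)
X, y = windows(dirs[:, 0], m=30)
print(G.shape, X.shape, y.shape)  # (60, 3) (90, 30) (90,)
```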

Note: should we cross-validate? How should we split up the samples to training and cross validation samples? comment below.

## Backtesting

Let’s use a time period of one year, from May 2013 to May 2014. For simplicity of illustration, we’ll use one security, ProShares Large Cap Core Plus (CSM), and compare the performance of the classification method to the performance of the S&P 500 index.
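The post runs this inside Zipline, but the core loop can be sketched without it: each day, retrain on the trailing \(n\)-day window, predict tomorrow's direction, and go long or short accordingly. Everything below (function name, the use of Bernoulli naive Bayes, the synthetic prices) is an illustrative assumption, not the post's actual backtest:

```python
import numpy as np
from sklearn.naive_bayes import BernoulliNB

def walk_forward_returns(prices, m=30, n=60):
    """Daily strategy returns from a retrain-every-day direction classifier.

    prices: 1-D array of daily closes. Position is +1 (long) when the
    classifier predicts an up move, -1 (short) when it predicts a down move.
    """
    prices = np.asarray(prices, dtype=float)
    moves = np.where(np.diff(prices) > 0, 1, -1)  # daily +/-1 directions
    rets = np.diff(prices) / prices[:-1]          # daily simple returns
    out = []
    for t in range(m + n, len(moves)):
        # Training window: n examples, each m lagged directions wide.
        X = np.array([moves[i - m:i] for i in range(t - n, t)])
        y = moves[t - n:t]
        clf = BernoulliNB().fit(X, y)
        signal = clf.predict(moves[t - m:t].reshape(1, -1))[0]
        out.append(signal * rets[t])
    return np.array(out)

rng = np.random.default_rng(1)
prices = 100 * np.cumprod(1 + rng.normal(0.0005, 0.01, 252))
strat = walk_forward_returns(prices)
print(strat.shape)  # (161,)
```

In Zipline the same logic would live in `handle_data`, with orders placed via the order API instead of returns being accumulated directly.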

## Classifiers tested

The classifiers tested include: **Naive Bayes, logistic regression, KNN, random forest**.

Note: how should regularization be done? L1? L2? Should we add **SVM** to the list? What kernel (Gaussian, polynomial, etc.) should we use for the SVM? Comment below.
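The four classifiers map naturally onto scikit-learn estimators. The hyperparameters below are illustrative defaults (except KNN's 5 neighbors, which matches the results section), not necessarily those used in the post:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import BernoulliNB
from sklearn.neighbors import KNeighborsClassifier

classifiers = {
    "Naive Bayes (Bernoulli)": BernoulliNB(),
    # penalty="l2" is scikit-learn's default; "l1" is the alternative
    # raised in the regularization question above.
    "Logistic Regression": LogisticRegression(penalty="l2", C=1.0),
    "KNN": KNeighborsClassifier(n_neighbors=5),
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=0),
}

# Quick smoke test on synthetic binarized direction features.
X = np.random.default_rng(0).choice([0, 1], size=(60, 30))
y = np.random.default_rng(1).choice([-1, 1], size=60)
for name, clf in classifiers.items():
    clf.fit(X, y)
    print(name, clf.predict(X[:1])[0])
```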

## Learning Metrics

Here’s what you all are really looking for… Let’s look at **backtesting return, accuracy, AUC, precision, recall**. The time period observed was Jan 3, 2012 to May 9, 2014. The return you would get for simply holding the security was 52%.
Shown below is the backtest for Naive Bayes:

For Logistic Regression:

KNN with 5 neighbors:

Random Forest:

Classifier | Return (%) | Accuracy | Beta | Sharpe
---|---|---|---|---
Naive Bayes (Bernoulli) | 16.1 | 48.4% | 0.31 | 0.75
Logistic Regression | 15.7 | 48.6% | 0.31 | 0.72
KNN | 14.19 | 48.8% | 0.29 | 0.55
Random Forest | 15.7 | 48.4% | 0.31 | 0.72
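For reference, the accuracy and return figures above can be computed from a series of out-of-sample direction calls. A minimal sketch (the function name and toy numbers are hypothetical):

```python
import numpy as np

def strategy_stats(y_true, y_pred, daily_returns):
    """Directional accuracy and compounded return of trading the
    predictions: long (+1) on a predicted up move, short (-1) on down."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    accuracy = float(np.mean(y_true == y_pred))
    total_return = float(np.prod(1 + y_pred * np.asarray(daily_returns)) - 1)
    return accuracy, total_return

# Toy example: 4 days, 3 correct direction calls.
acc, ret = strategy_stats([1, -1, 1, 1], [1, -1, -1, 1],
                          [0.01, -0.02, 0.01, 0.03])
print(acc)             # 0.75
print(round(ret, 4))   # 0.0505
```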

## Conclusion

What do you say? Astonishing performance indeed? Are random forests the clear robust winners? If Silicon Valley uses them so much, should we?

How do you measure robustness – returns or AUC? The reason metrics like AUC, precision, and recall weren’t included here is that there’s really no point in measuring them. What counts as a false positive, or a false negative? Does that really change the decision making? We already make a decision (long or short) based upon the predicted indicator, so all that matters is the return. It’s not like we’re trying to detect breast cancer with the classifier, in which case it might be good to have high recall and AUC.

For the future, maybe we can vary the length of the feature window used in the training model, use contra-indicators (e.g., use Walmart as a contra-indicator against the market), or use groups of securities in the training model, as described above in the methods.

Stay tuned.