2-20. Overlapping Labels Archives

6 – L6 11 HS Foreshadow V2

2021-08-08 by Dr. Serendipity

So far in these lessons, you’ve learned a lot about using decision tree models to build high-performing trading strategies. You’ve gained insights into the algorithms themselves and how they work on the inside, and into the data: how to engineer the most useful features and how to generate labels. In this lesson, you … Read more

5 – L6 08 HS Ensemble Models Trained On Nonoverlapping Periods V5

2021-08-08 by Dr. Serendipity

Another possibility is based on the observation that if we’re working with weekly returns, there are five different sets of non-overlapping labels we could use, depending on which day of the week we calculate the returns from. The idea is to train a separate random forest model on each set of non-overlapping labels and then ensemble … Read more
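
The idea in this excerpt can be sketched roughly as follows. This is only an illustration under stated assumptions: the data is synthetic, rows are assumed to be consecutive trading days with a 5-day label horizon, and averaging predicted probabilities is just one simple way to combine the models (the excerpt is cut off before it says how the ensemble is formed).

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
n_rows, HORIZON = 1000, 5  # HORIZON: assumed label horizon in trading days

# Synthetic stand-ins for daily features and labels (not real market data).
X = rng.normal(size=(n_rows, 4))
y = (X[:, 0] + rng.normal(size=n_rows) > 0).astype(int)

# Model k is trained only on rows k, k+5, k+10, ... so the labels
# inside each training set never overlap in time.
models = []
for offset in range(HORIZON):
    rows = np.arange(offset, n_rows, HORIZON)
    forest = RandomForestClassifier(n_estimators=20, random_state=offset)
    models.append(forest.fit(X[rows], y[rows]))

# Combine the five models by averaging class probabilities (an assumption).
X_new = rng.normal(size=(10, 4))
avg_proba = np.mean([m.predict_proba(X_new) for m in models], axis=0)
print(avg_proba.shape)
```

Every row of the original data is still used by exactly one model, so nothing is thrown away, while no single model sees overlapping labels.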

4 – L6 06 HS Adjust Bag Size V4

2021-08-08 by Dr. Serendipity

Another possibility involves adjusting the bagging procedure inside the random forest model. The idea is to pick a smaller number of samples for each bag in order to reduce the influence of redundant information. In scikit-learn random forest implementations before version 0.22 (which added the `max_samples` parameter), changing the bag size is not an option. For these models, the number of rows in … Read more
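
One way to experiment with a smaller bag size is to bag decision trees directly. The sketch below is not necessarily the approach the lesson takes: it assumes synthetic data, a 5-day label horizon, and a bag size of one fifth of the rows, and it uses scikit-learn's `BaggingClassifier` with its `max_samples` argument.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for a feature matrix and labels (not real market data).
X, y = make_classification(n_samples=1000, n_features=10, random_state=0)

HORIZON = 5  # assumed label horizon in trading days

# Each tree is fit on a bootstrap bag of about n_rows / HORIZON rows,
# instead of a bag as large as the full training set.
model = BaggingClassifier(
    DecisionTreeClassifier(max_features="sqrt"),  # random-forest-style trees
    n_estimators=50,
    max_samples=1.0 / HORIZON,
    bootstrap=True,
    random_state=0,
)
model.fit(X, y)

# Number of rows drawn for each tree's bag.
bag_sizes = [len(idx) for idx in model.estimators_samples_]
print(max(bag_sizes))
```

Setting `max_features="sqrt"` on the base tree makes each split consider a random subset of features, which mimics random forest behavior inside the bagging wrapper.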

3 – L6 05 HS Subsample Rows V4

2021-08-08 by Dr. Serendipity

The simplest solution to this problem is to just subsample the rows of the original data set so that the labels do not overlap in time. In other words, if you are calculating weekly returns, you would just use the weekly returns of non-overlapping weeks: for example, the weekly returns calculated every Friday. This … Read more
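
A minimal sketch of this subsampling, using only the standard library. The calendar here is hypothetical (weekdays only, holidays ignored), and the names `rows`, `fridays`, and `every_fifth` are illustrative.

```python
from datetime import date, timedelta

HORIZON = 5  # assumed label horizon in trading days

# Hypothetical index of 60 daily rows: weekdays only, holidays ignored.
rows = []
day = date(2021, 1, 4)  # a Monday
while len(rows) < 60:
    if day.weekday() < 5:
        rows.append(day)
    day += timedelta(days=1)

# Option A: keep one row per week, for example every Friday.
fridays = [d for d in rows if d.weekday() == 4]

# Option B: keep every HORIZON-th row, which needs no calendar information.
every_fifth = rows[::HORIZON]

print(len(rows), len(fridays), len(every_fifth))
```

Either option keeps only one row in five, which is the price of this approach: the remaining labels do not overlap, but most of the data goes unused.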

2 – L6 03 HS The Problem V4

2021-08-08 by Dr. Serendipity

The core problem is that if every day we calculate, say, weekly returns, then the weekly return on a given day will be correlated with the weekly returns of the days surrounding it, because they draw on data from the same time period. This is a problem for lots of machine learning models which frequently … Read more
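
The overlap effect can be seen in a small simulation. The sketch below assumes independent, normally distributed daily log returns (so a weekly return is a sum of five daily ones); any correlation between weekly labels then comes purely from shared days.

```python
import random

random.seed(0)
HORIZON = 5  # assumed number of trading days in a week

# Simulated i.i.d. daily log returns: no day is correlated with another.
daily = [random.gauss(0.0, 0.01) for _ in range(5000)]

# Weekly label computed every day: sum of the next HORIZON daily returns.
weekly = [sum(daily[t:t + HORIZON]) for t in range(len(daily) - HORIZON + 1)]


def lagged_corr(series, lag):
    """Pearson correlation between series[t] and series[t + lag]."""
    x, y = series[:-lag], series[lag:]
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


corr_adjacent = lagged_corr(weekly, 1)        # labels share 4 of 5 days
corr_disjoint = lagged_corr(weekly, HORIZON)  # labels share no days
print(f"lag 1: {corr_adjacent:.2f}, lag {HORIZON}: {corr_disjoint:.2f}")
```

Under these assumptions, labels one day apart share four of their five days, so their correlation is high, while labels five days apart share none and are close to uncorrelated.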

1 – L6 01 HS Intro V2

2021-08-08 by Dr. Serendipity

As we mentioned before, if features are inputs to a supervised learning model, then the targets, also called labels, are the outputs of the model. In the last lesson, you learned how to create several features that may be useful for improving your model’s performance. The features help by providing context about market conditions, which … Read more