3 – Creating Testing Sets

The next thing to do is create our training, validation, and test sets. You'll probably have to do this for every network you build, so it's good practice to do it for every data set you use. I've found that basically every data set needs a slightly different approach, because the data comes in different shapes and the way you want to feed it into your network differs. So you end up doing this in a lot of different ways, many times. Typically you have some split fraction that defines what fraction of the data you want to keep in the training set; most of the time this is something like 0.8 or 0.9. So here I split the data into a training set and a validation set for the features, and a training set and a validation set for the labels. From there, I split the validation set again into a smaller validation set and a test set, and do the same for the labels. If you do this right, these are the shapes of the features that you should see: train x, validation x, and test x should be these sizes.
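The two-stage split described above can be sketched roughly as follows. This is a minimal illustration, not the exact code from the lesson: the helper name `split_data`, the placeholder arrays, and the choice to cut the held-out portion exactly in half are all assumptions made for the example.

```python
import numpy as np


def split_data(features, labels, split_frac=0.8):
    """Split features/labels into train, validation, and test sets.

    split_frac is the fraction of the data kept for training. The remainder
    is divided in half into validation and test sets (the half/half cut is
    an assumption for this sketch).
    """
    # First split: training set vs. everything held out.
    split_idx = int(len(features) * split_frac)
    train_x, val_x = features[:split_idx], features[split_idx:]
    train_y, val_y = labels[:split_idx], labels[split_idx:]

    # Second split: held-out data into a smaller validation set and a test set.
    half_idx = len(val_x) // 2
    val_x, test_x = val_x[:half_idx], val_x[half_idx:]
    val_y, test_y = val_y[:half_idx], val_y[half_idx:]

    return (train_x, train_y), (val_x, val_y), (test_x, test_y)


# Placeholder data (hypothetical sizes): 1000 examples, 200 features each.
features = np.zeros((1000, 200), dtype=int)
labels = np.zeros(1000, dtype=int)

(train_x, train_y), (val_x, val_y), (test_x, test_y) = split_data(features, labels)

print("Train set:     ", train_x.shape)  # (800, 200)
print("Validation set:", val_x.shape)    # (100, 200)
print("Test set:      ", test_x.shape)   # (100, 200)
```

Note that this slices the data in order. If your examples are sorted in some way (for instance, by label), you would probably want to shuffle the features and labels together before splitting.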
