51 – M7L7 51 SHAP One Solution V2

Okay. So now let's fill in this starter code for shap_feature_i. First, S_list: we generate all subsets and put them into S_list. If we scroll back up in our notebook, we can see where we defined the function for that. So this is generate_all_subsets, and it takes in a set of features. Now, note that S_list, if we look at the formula, holds all the subsets S that exclude feature i. So to exclude feature i, we want to pass in all_features minus i here; we don't want to use all_features.

Now let's initialize phi. Recall from the formula that phi is going to store the sum of all these terms, all these products, so we initialize it to zero. The total number of features, M, is just the length of all_features, which is a set.

Now let's iterate through all the subsets S. Think of this loop as the summation here in the formula, going through all the subsets S that exclude feature i. We'll come back to this in a bit, but first, phi is going to add up all those products. If I go back to the formula, it's adding up the weight times the marginal contribution, and both of these are already defined as functions, so let's go look for them. Here we have the weight of the marginal contribution as well as the marginal contribution of a feature. I'm going to save both of these to the clipboard: there's the weight, and there's also the marginal contribution. Let's go back down to our function. So here, we have the weight of the marginal contribution times the marginal contribution of the feature. Now let's update these arguments.

Remember, size_S is the size of that subset S. One thing to note is that we use the keyword None to represent the empty set, and if we used the len function to compute the size of a subset, it would count None as one additional element. We want to exclude that, so size_S should be the length of S minus one whenever None is part of the subset; otherwise, size_S is just the length of S. That's this right here. For M, we just pass in the total number of features; that's what M stands for. Then for the marginal contribution of the feature: tree, we just reuse the tree that we're passing in here; x we also reuse (let me put this on a new line); S, this subset, is what we got from the loop; and feature_i is what we got from up here. Okay, let's put this back on one line.

So let's try out the function. You can see here that if we calculate the feature importance of feature zero, we get 0.375. This may look familiar from the previous exercise, where we also calculated 0.375. All right, thanks for going through this, and please continue on with the lesson.
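For reference, the formula the walkthrough keeps pointing to (shown on screen in the video) is the standard exact SHAP value for feature i:

```latex
\phi_i = \sum_{S \subseteq F \setminus \{i\}}
\frac{|S|!\,(M - |S| - 1)!}{M!}
\Bigl( f\bigl(S \cup \{i\}\bigr) - f(S) \Bigr)
```

Here F is the set of all features, M = |F| is the total number of features, and f(S) is the model's prediction using only the features in S. The fraction is the "weight" and the difference in parentheses is the "marginal contribution" referred to above. As a quick worked instance: with M = 3 and S the empty set, the weight is 0! · 2! / 3! = 2/6 = 1/3.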
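To make the steps above concrete, here is a minimal runnable sketch of the completed function. The helper names (generate_all_subsets, weight_of_subset, marginal_contribution_of_feature) and the assumption that tree is a callable tree(x, S) returning the expected prediction given only the features in S are reconstructions from the narration, not the actual starter code, which may differ in naming and signatures.

```python
from itertools import combinations
from math import factorial


def generate_all_subsets(features):
    # Every subset of `features`; following the video's convention,
    # the empty set is represented by the keyword None inside a set.
    features = list(features)
    subsets = [{None}]
    for r in range(1, len(features) + 1):
        subsets.extend(set(combo) for combo in combinations(features, r))
    return subsets


def weight_of_subset(size_S, M):
    # The SHAP weight |S|! * (M - |S| - 1)! / M! from the formula.
    return factorial(size_S) * factorial(M - size_S - 1) / factorial(M)


def marginal_contribution_of_feature(tree, x, S, feature_i):
    # f(S ∪ {i}) - f(S): how much adding feature_i changes the prediction.
    # `tree` is assumed here to be a callable tree(x, S) giving the expected
    # prediction when only the features in S are known; the real exercise
    # computes this by traversing the fitted decision tree.
    S = {f for f in S if f is not None}  # drop the empty-set marker
    return tree(x, S | {feature_i}) - tree(x, S)


def shap_feature_i(tree, x, all_features, feature_i):
    # All subsets S that exclude feature_i, matching the summation.
    S_list = generate_all_subsets(all_features - {feature_i})

    phi = 0                # accumulates weight * marginal contribution
    M = len(all_features)  # total number of features

    for S in S_list:
        # None marks the empty set, but len() would count it as an
        # element, so subtract one whenever None is part of the subset.
        size_S = len(S) - 1 if None in S else len(S)
        phi += (weight_of_subset(size_S, M)
                * marginal_contribution_of_feature(tree, x, S, feature_i))
    return phi
```

As a quick sanity check on the weights (independent of any tree): over all subsets that exclude feature i, the SHAP weights sum to 1, since each subset size s contributes C(M-1, s) · s!(M-s-1)!/M! = 1/M and there are M possible sizes.

```python
all_features = {0, 1, 2}
subsets = generate_all_subsets(all_features - {0})
total = sum(weight_of_subset(len(S) - 1 if None in S else len(S), len(all_features))
            for S in subsets)
print(total)  # 1.0 (up to floating-point rounding)
```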
