10 – Chain Rule

So before we start calculating derivatives, let’s do a refresher on the chain rule, which is the main technique we’ll use to calculate them. Say you have a variable x and a function f that you apply to x to get f of x, which we’re gonna call A, and then another function g, which you apply to f of x to get g of f of x, which we’re gonna call B. The chain rule says that if you want to find the partial derivative of B with respect to x, that’s just the partial derivative of B with respect to A times the partial derivative of A with respect to x: ∂B/∂x = ∂B/∂A · ∂A/∂x. So it literally says that when composing functions, the derivatives just multiply. That’s gonna be super useful for us, because feedforward is literally composing a bunch of functions, and backpropagation is literally taking the derivative at each piece. Since taking the derivative of a composition is the same as multiplying the partial derivatives, all we’re gonna do is multiply a bunch of partial derivatives to get what we want. Pretty simple, right?
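
To make the rule concrete, here is a minimal Python sketch that checks it numerically. The particular functions are assumptions chosen only for illustration and are not from the lecture: f(x) = x² plays the role of A, and g(A) = sin(A) plays the role of B. The helper names are hypothetical as well.

```python
import math


def f(x):
    # Inner function (illustrative assumption): A = f(x) = x^2
    return x * x


def df_dx(x):
    # Derivative of A with respect to x
    return 2.0 * x


def g(a):
    # Outer function (illustrative assumption): B = g(A) = sin(A)
    return math.sin(a)


def dg_da(a):
    # Derivative of B with respect to A
    return math.cos(a)


def chain_rule_derivative(x):
    # dB/dx = dB/dA * dA/dx, with dB/dA evaluated at A = f(x)
    a = f(x)
    return dg_da(a) * df_dx(x)


def numerical_derivative(x, h=1e-6):
    # Central finite difference of the composition g(f(x)),
    # used only as an independent check on the chain rule result
    return (g(f(x + h)) - g(f(x - h))) / (2.0 * h)


x = 1.5
analytic = chain_rule_derivative(x)
numeric = numerical_derivative(x)
print("chain rule:       ", analytic)
print("finite difference:", numeric)
```

The two printed values should agree to several decimal places. The same pattern, multiplying the local derivative of each piece, is what gets repeated layer by layer when the composition is longer.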
