10 – Summary

In summary, here’s what you learned in this lesson. Traditional reinforcement learning techniques use a finite MDP to model an environment, which limits us to environments with discrete state and action spaces. To extend our learning algorithms to continuous spaces, we can do one of two things: discretize the state space, or directly approximate the desired value functions. Discretization can be performed using a uniform grid, tile coding, or coarse coding, which indirectly leads to an approximation of the value function (see the sketch below). Directly approximating a continuous value function can be done by first defining a feature transformation and then computing a linear combination of those features. Using non-linear feature transforms like radial basis functions allows us to use the same linear combination framework to capture some non-linear relationships (a second sketch follows). To represent non-linear relationships across combinations of features, we can apply an activation function, and this sets us up to use deep neural networks for reinforcement learning.
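
To make the discretization idea concrete, here is a minimal sketch of uniform-grid discretization of a 2-D continuous state. The helper names (`create_uniform_grid`, `discretize`) and the bin counts are illustrative assumptions, not code from the lesson; tile coding and coarse coding build on the same idea with multiple overlapping grids or receptive fields.

```python
import numpy as np

def create_uniform_grid(low, high, bins=(10, 10)):
    """Split each state dimension into equally spaced bins.

    Returns a list of arrays of interior split points, one per dimension.
    (Hypothetical helper; the lesson's own code may differ.)
    """
    return [np.linspace(l, h, b + 1)[1:-1] for l, h, b in zip(low, high, bins)]

def discretize(state, grid):
    """Map a continuous state to a tuple of integer bin indices."""
    return tuple(int(np.digitize(s, g)) for s, g in zip(state, grid))

# Example: a 2-D continuous state in [-1, 1] x [-5, 5]
grid = create_uniform_grid(low=[-1.0, -5.0], high=[1.0, 5.0], bins=(10, 10))
print(discretize([0.25, -3.4], grid))  # -> (6, 1)
```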
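And here is a minimal sketch of direct value-function approximation: radial basis function features combined linearly, with a simple gradient update toward a target. The function names (`rbf_features`, `v_hat`, `update`), the Gaussian width `sigma`, and the step size `alpha` are assumptions for illustration only.

```python
import numpy as np

def rbf_features(state, centers, sigma=0.5):
    """Radial basis function features: one Gaussian bump per center."""
    state = np.asarray(state, dtype=float)
    dists = np.linalg.norm(centers - state, axis=1)
    return np.exp(-(dists ** 2) / (2 * sigma ** 2))

def v_hat(state, weights, centers):
    """Approximate state value as a linear combination of the features."""
    return rbf_features(state, centers).dot(weights)

def update(weights, state, target, centers, alpha=0.1):
    """One gradient step: for a linear model the gradient w.r.t. the
    weights is just the feature vector itself."""
    x = rbf_features(state, centers)
    return weights + alpha * (target - x.dot(weights)) * x

# Example: 25 RBF centers laid out on a grid over a 2-D state space
cx, cy = np.meshgrid(np.linspace(-1, 1, 5), np.linspace(-1, 1, 5))
centers = np.column_stack([cx.ravel(), cy.ravel()])
weights = np.zeros(len(centers))
weights = update(weights, state=[0.2, -0.4], target=1.0, centers=centers)
print(v_hat([0.2, -0.4], weights, centers))
```

Replacing this fixed feature transform with learned, stacked non-linear layers is exactly the step that leads to deep neural networks in the next lesson.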
