2 – Discrete vs. Continuous Spaces

Let us first take a look at what we mean by discrete and continuous spaces. Recall the definition of a Markov Decision Process where we assume that the environment state at any time is drawn from a set of possible states. When the set is finite, we can call it a discrete state space. Similarly with actions, if there is a finite set of them, the environment is said to have a discrete action space. Having discrete spaces simplifies things for us. For starters, it allows us to represent any function of states and actions as a dictionary or look-up table. Consider the state value function V which is a mapping from the set of states to a real number. If you encode states as integers, you can code up the value function as a dictionary using each state as a key. Similarly, consider the action value function Q that maps every state action pair to a real number. Again, you could use a dictionary here or store the value function as a table or matrix, where each row corresponds to a state, and each column to an action. Discreet spaces are also critical to a number of reinforcement learning algorithms. For instance, in value iteration, this internal four loop goes over each state as one by one, and updates the corresponding value estimate V of s. This is impossible if you have an infinite state space. The loop would go on forever even for discrete state spaces with a lot of states this can quickly become infeasible. Model-free methods like Q-learning assume discrete spaces as well. Here, this max is being computed over all possible actions from state S prime which is easy when you have a finite set of actions. But this tiny step itself becomes a full-blown optimization problem if your action space is continuous. So what do we exactly mean by continuous spaces. The term continuous is used to contrast with discrete. That is a continuous space is not restricted to a set of distinct values like integers. Instead it can take a range of values, typically real numbers. This means, quantities like state values that could be depicted as, say a bar chart, for a discrete case, one bar for every state, will now need to be thought of as a a density plot over a desired range. The same notion extends to environments where the state is no longer a single real valued number but a vector of such numbers. This is still referred to as a continuous space just with more than one dimension. Okay, before we go any further, let’s try to build some intuition for why continuous state spaces are important. Where do they even come from? When you consider a high-level decision making task like playing chess, you can often think of the set of possible states as discrete. What piece is in which square on the board. You don’t need to bother with precisely where each piece is located within its square or which way it is facing. Although these details are available for you to inspect and wonder about, why is your knight staring at my queen. These things are not relevant to the problem at hand and you can abstract them away in your model of the game. In general, grid-based worlds are very popular in reinforcement learning. They give you a glimpse at how agents might act in spatial environments. But real physical spaces are not always neatly divided up into grids. There is no cell 5-3 for the vacuum cleaner robot to go to. It has to chart a course from its current position to say 2.5 meters from the west wall by 1.8 meters from the north wall. It also has to keep track of its heading and turn smoothly to face the direction it wants to move in. These are all real numbers that the agent may need to process and represent as part of the state. Actions too can be continuous. Take for example a robot that plays darts. It has to set the height and angle it wants to release the dart at, choose an appropriate level of power with which to throw et cetera. Even small differences in these values can have a large impact on where the dart ultimately lands on the board. In general, most actions that need to take place in a physical environment are continuous in nature. Clearly, we need to modify our representation or algorithms or both to accommodate continuous spaces. The two main strategies we’ll be looking at are Discretization and Function Approximation.

%d 블로거가 이것을 좋아합니다: