1-8-4. MC Prediction – Part 1

2021-08-09 by Dr. Serendipity

Video: L604 MC Prediction Part 1

Important Note

This video walks through a toy example in which the agent collects two episodes, consolidates the information in a table, and then uses the table to come up with a better policy. However, as discussed in the previous video, in real-world settings (and even for the toy example shown here!), the agent will want to collect many more episodes so that it can better trust the information stored in the table. We use only two episodes here to keep the example simple.
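The collect, consolidate, improve loop described in the note can be sketched in a few lines of Python. This is only an illustrative sketch: the state names, actions, rewards, and both function names are hypothetical and are not taken from the video. It assumes each episode is a list of `(state, action, reward)` tuples, and it averages returns on every visit to a state-action pair (in these two episodes each pair occurs at most once per episode, so first-visit averaging would give the same table).

```python
from collections import defaultdict


def q_table_from_episodes(episodes, gamma=1.0):
    """Average the return observed after each (state, action) pair."""
    returns_sum = defaultdict(float)
    visit_count = defaultdict(int)
    for episode in episodes:
        G = 0.0
        # Walk backwards so G accumulates the discounted return from each step.
        for state, action, reward in reversed(episode):
            G = reward + gamma * G
            returns_sum[(state, action)] += G
            visit_count[(state, action)] += 1
    return {sa: returns_sum[sa] / visit_count[sa] for sa in returns_sum}


def greedy_policy(q_table):
    """For each state, pick the action with the highest estimated value."""
    best = {}
    for (state, action), value in q_table.items():
        if state not in best or value > best[state][1]:
            best[state] = (action, value)
    return {state: action for state, (action, _) in best.items()}


# Two hypothetical episodes (made-up states, actions, and rewards).
episodes = [
    [("s1", "right", -1.0), ("s2", "right", 10.0)],
    [("s1", "left", -1.0), ("s1", "right", -1.0),
     ("s2", "left", -1.0), ("s2", "right", 10.0)],
]

q_table = q_table_from_episodes(episodes)
policy = greedy_policy(q_table)
print(q_table)
print(policy)
```

With only two episodes, several entries in the table are averages over one or two returns, which is exactly why the note recommends collecting many more episodes before trusting the table.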
