CS5225 Project 2: Monte Carlo Prediction and Control

In this project, you will implement two families of model-free algorithms. The first is Monte Carlo (MC): first-visit on-policy MC prediction and on-policy MC control, both applied to blackjack. The second is Temporal-Difference (TD) learning: Sarsa (on-policy) and Q-learning (off-policy), both applied to cliff walking.
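As a rough illustration of first-visit MC prediction, the sketch below averages the return that follows the first occurrence of each state in an episode. It is not the project's starter code: the function name `first_visit_mc_prediction`, the episode format (a list of `(state, reward)` pairs, where the reward is the one received after leaving that state), and the default `gamma` are assumptions made for this example.

```python
from collections import defaultdict


def first_visit_mc_prediction(episodes, gamma=1.0):
    """Estimate V(s) by averaging first-visit returns.

    Hypothetical helper: `episodes` is an iterable of episodes, each a list
    of (state, reward) pairs generated by following the policy under
    evaluation.
    """
    returns_sum = defaultdict(float)
    returns_count = defaultdict(int)

    for episode in episodes:
        # Record the time step at which each state first appears.
        first_visit = {}
        for t, (state, _) in enumerate(episode):
            first_visit.setdefault(state, t)

        # Walk backwards so the discounted return G accumulates incrementally.
        G = 0.0
        for t in range(len(episode) - 1, -1, -1):
            state, reward = episode[t]
            G = gamma * G + reward
            # Only the first visit to a state contributes to its average.
            if first_visit[state] == t:
                returns_sum[state] += G
                returns_count[state] += 1

    return {s: returns_sum[s] / returns_count[s] for s in returns_sum}
```

For blackjack, a state would typically be a tuple such as (player sum, dealer card, usable ace), but any hashable value works with this sketch.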

The TA will run your code twice. You will receive full credit if at least one of the two runs passes.

Hints

  • On-policy first visit Monte-Carlo prediction

  • On-policy first visit Monte-Carlo control

  • Sarsa (on-policy TD control)

  • Q-learning (off-policy TD control)
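The difference between the two TD control methods listed above comes down to the bootstrap target. The sketch below shows one tabular update step for each, plus an epsilon-greedy action selector. All names here (`make_q`, `epsilon_greedy`, `sarsa_update`, `q_learning_update`) and the default `alpha` and `gamma` values are assumptions for illustration, not the interface required by the assignment.

```python
import random
from collections import defaultdict


def make_q(n_actions):
    """Hypothetical Q-table: maps each state to a list of action values."""
    return defaultdict(lambda: [0.0] * n_actions)


def epsilon_greedy(Q, state, n_actions, epsilon, rng=random):
    """Pick a random action with probability epsilon, else a greedy one."""
    if rng.random() < epsilon:
        return rng.randrange(n_actions)
    values = Q[state]
    return max(range(n_actions), key=lambda a: values[a])


def sarsa_update(Q, s, a, r, s_next, a_next, done, alpha=0.5, gamma=1.0):
    """On-policy: bootstrap from the action actually chosen in s_next."""
    target = r if done else r + gamma * Q[s_next][a_next]
    Q[s][a] += alpha * (target - Q[s][a])


def q_learning_update(Q, s, a, r, s_next, done, alpha=0.5, gamma=1.0):
    """Off-policy: bootstrap from the greedy (maximum) value in s_next."""
    target = r if done else r + gamma * max(Q[s_next])
    Q[s][a] += alpha * (target - Q[s][a])
```

In a training loop, Sarsa selects `a_next` with `epsilon_greedy` before updating and then executes that same action, whereas Q-learning updates first and selects its next action afterwards.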