Name: CS 7641 Assignment 4: Markov Decision Processes and Reinforcement Learning Solved

Description


This project seeks to understand three reinforcement learning algorithms by applying each of them to two different Markov decision processes (MDPs). The methods are value iteration, policy iteration, and Q-learning. The two toy MDPs are inspired by Pacman: a small 5×5 grid world and a large 20×20 grid world.
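Of the three methods, value iteration and policy iteration plan with a known model of the MDP, while Q-learning learns from sampled transitions. As a rough illustration of the latter, here is a minimal sketch of the standard tabular Q-learning update. The function name, the dictionary-based Q-table, and the default `alpha` and `gamma` values are assumptions made for this example, not part of the assignment.

```python
from collections import defaultdict


def q_learning_update(Q, state, action, reward, next_state, actions,
                      alpha=0.1, gamma=0.9, terminal=False):
    """Apply one tabular Q-learning update and return the new Q-value.

    Q is a mapping from (state, action) pairs to estimated returns.
    alpha (learning rate) and gamma (discount) are illustrative defaults.
    """
    # Value of the best action in the next state; zero if the episode ended.
    best_next = 0.0 if terminal else max(Q[(next_state, a)] for a in actions)
    # Temporal-difference target: immediate reward plus discounted lookahead.
    td_target = reward + gamma * best_next
    # Move the current estimate a fraction alpha toward the target.
    Q[(state, action)] += alpha * (td_target - Q[(state, action)])
    return Q[(state, action)]


# Hypothetical usage: a Q-table that defaults every unseen entry to 0.0.
Q = defaultdict(float)
new_value = q_learning_update(Q, state=0, action="right", reward=95.0,
                              next_state=1, actions=["left", "right"],
                              alpha=0.5, terminal=True)
```

A `defaultdict(float)` keeps the sketch short because unseen state-action pairs start at zero without any explicit initialization.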

For each grid, Pacman (our learning agent) starts in the top-left corner and attempts to navigate to the goal while collecting as high a score as possible along the way. As in the real game, Pacman can earn points by eating pellets and fruit, but he must avoid the ghosts at all costs. The reward structure for each grid world is:

Small pellets (S) = +1 point
Medium fruit (M) = +2.5 points
Large ghosts (L) = -50 points
Reaching the goal = +100 points
Every step = -5 points to encourage Pacman to reach his goal quickly.
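To make the reward structure concrete, the following is a minimal sketch of value iteration on a small grid using the point values listed above. The 5×5 layout, the deterministic moves, the discount factor, and the simplifications that items stay in place after being eaten and that ghosts do not end the episode are all assumptions made for illustration; the assignment's actual grids and dynamics may differ.

```python
# Reward values taken from the list above; "G" marks the goal cell.
REWARDS = {"S": 1.0, "M": 2.5, "L": -50.0, "G": 100.0}
STEP_COST = -5.0

# Hypothetical 5x5 layout (not the assignment's grid); "." is an empty cell.
GRID = [row.split() for row in [
    ". S . . .",
    ". L . M .",
    ". . S . .",
    ". M . L .",
    ". . . S G",
]]
N = len(GRID)
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right


def step(state, action):
    """Deterministic move; walking into a wall leaves Pacman in place."""
    row = min(max(state[0] + action[0], 0), N - 1)
    col = min(max(state[1] + action[1], 0), N - 1)
    # Every step costs 5 points, plus whatever the entered cell is worth.
    reward = STEP_COST + REWARDS.get(GRID[row][col], 0.0)
    return (row, col), reward


def value_iteration(gamma=0.9, tol=1e-6):
    """Sweep Bellman optimality backups until values change by less than tol."""
    V = {(r, c): 0.0 for r in range(N) for c in range(N)}
    while True:
        delta = 0.0
        for s in V:
            if GRID[s[0]][s[1]] == "G":
                continue  # the goal is terminal, so its value stays 0
            outcomes = [step(s, a) for a in ACTIONS]
            best = max(reward + gamma * V[s2] for s2, reward in outcomes)
            delta = max(delta, abs(best - V[s]))
            V[s] = best
        if delta < tol:
            return V


V = value_iteration()
```

Under these assumptions, a cell directly next to the goal has value 95: the +100 goal reward minus the 5-point step cost, with nothing further to collect afterward.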

