Wednesday, May 17, 2006

Applying Random Plus Tree Actions To Pacman

Initial attempts to introduce random directed actions were stymied by a series of coding errors. Unfortunately, the mistakes are costly: a batch run takes approximately 24 hours (mostly due to sampling sequences of random and tree actions). After a few days' worth of mistakes, I reduced the number of trials per batch from 250 to 75.

Here are some of the mistakes I've made:
Set the tunnelFlag to 1 when it should have been 0.
Recursively generated the same data structure, eating up memory.

One thing I spotted was that the initial tree action was for arc 11, which is at the tail end of the graph. The policy would choose to look at arc 11 and then take a series of random actions, sometimes leading to paths that never included arc 11. I fixed this by determining which path was closest to being completed, based on which arcs had already been looked at, and selecting look actions corresponding to unexamined arcs on that path.
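As a rough illustration of that fix, here is a minimal sketch. It assumes each candidate path is a list of arc numbers and that "closest to being completed" means "fewest unexamined arcs remaining"; the function name and the data layout are made up for this example and are not taken from the actual code.

```python
def next_look_arcs(paths, examined):
    """Return the unexamined arcs on the path closest to completion.

    paths    -- list of paths, each a list of arc ids (hypothetical layout)
    examined -- set of arc ids that have already been looked at
    """
    best = None
    for path in paths:
        remaining = [arc for arc in path if arc not in examined]
        if not remaining:
            continue  # path fully examined, nothing left to look at
        # Assumption: fewer remaining arcs means closer to completion.
        # Ties keep the earlier path in the list.
        if best is None or len(remaining) < len(best):
            best = remaining
    return best or []


# Two toy paths; arcs 1 and 2 have been examined already.
paths = [[1, 2, 3], [4, 5, 11]]
look_arcs = next_look_arcs(paths, examined={1, 2})
```

With this selection rule the look actions stay on a path the agent is already committed to, instead of jumping to a far-away arc such as 11.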

Another question was why the policy was choosing tree actions even when ghost speed was really high, which I found bizarre. I checked whether the transition probabilities I had gathered made sense. They did not. When I summed the probability of transitioning into the "eaten during planning" state from all states, I got a large number for the case where the ghosts do not move during computation. That should not happen: it is impossible to be eaten during planning if the ghosts do not move.
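A check along these lines can be automated. The sketch below assumes the gathered transition model is stored as a dictionary mapping (state, action) pairs to a dictionary of next-state probabilities; that layout, the state names, and the function names are all hypothetical and only meant to show the idea.

```python
def mass_into_state(transitions, target):
    """Sum the probability of entering `target` over all (state, action) pairs."""
    return sum(probs.get(target, 0.0) for probs in transitions.values())


def unnormalized_rows(transitions, tol=1e-9):
    """Return the (state, action) keys whose probabilities do not sum to 1."""
    return [key for key, probs in transitions.items()
            if abs(sum(probs.values()) - 1.0) > tol]


# Toy model for the "ghosts do not move during computation" case.
# "eaten" stands in for the eaten-during-planning state.
static_ghosts = {
    ("s0", "random"): {"s1": 1.0},
    ("s1", "tree"): {"s1": 0.5, "goal": 0.5},
}

eaten_mass = mass_into_state(static_ghosts, "eaten")
bad_rows = unnormalized_rows(static_ghosts)
```

For the static-ghost case the mass flowing into the eaten state should be exactly zero, so any positive value points at a bug in how the transitions were collected or stored.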

I then realized that I had made an indexing error: I was not allocating enough space in my transition matrix for the "number of random actions taken" feature. This probably led to something being overwritten when the number of random actions taken was at its maximum. That made taking the maximum number of random actions look bad, which pushed the policy to choose a tree action instead.
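To show how this kind of error can cause overwriting, here is a small sketch. It assumes the transition matrix is stored as a flat array indexed by (state, number of random actions taken), and that the count runs from 0 up to and including a maximum. The constant, the function names, and the flat layout are assumptions for illustration, not the real code.

```python
MAX_RANDOM_ACTIONS = 3  # hypothetical cap on random actions taken


def flat_index(state, n_random, n_states, max_random=MAX_RANDOM_ACTIONS):
    """Map (state, n_random) to a slot in a flat array, with bounds checks."""
    slots = max_random + 1  # counts 0..max_random inclusive need max_random + 1 slots
    if not 0 <= state < n_states:
        raise IndexError("state out of range: %d" % state)
    if not 0 <= n_random <= max_random:
        raise IndexError("random-action count out of range: %d" % n_random)
    return state * slots + n_random


def buggy_flat_index(state, n_random, max_random=MAX_RANDOM_ACTIONS):
    """Same mapping with one slot too few per state (the off-by-one)."""
    return state * max_random + n_random


# With too few slots, the maximum count for state 0 lands on state 1's first slot.
collision = buggy_flat_index(0, MAX_RANDOM_ACTIONS) == buggy_flat_index(1, 0)
```

Under the buggy mapping, statistics recorded for the maximum count silently mix with those of a neighbouring entry, which is one way the maximum number of random actions could end up looking worse than it is. The explicit bounds checks turn that silent corruption into an immediate error.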

This was the latest problem discovered. I am running corrected trials now.
