Thursday, September 07, 2006

Example of Path Completion Problem

The path completion problem involves cases where the decision tree does not provide enough looks to find a complete path to the goal. Under these circumstances, I find the *optimal* path using the true arc costs of the known arcs, and the worst-case cost of the unknown arcs.


Example
In this example, I show that the path completion problem is present even in the zero-computation cost case. For a particular arc cost instantiation =
[1 1 0 1 1 1 1 0 1 1 0 0 1 0 0 0 ], where 0 means an arc cost of 0 and 1 means a cost of 2, we see that the decision tree recommends looking at arcs 2,1,8,3. The mean predicted cost of th optimal path given the known arc costs is supposed to be 2 with std = 0. The optimal path completion arcs should be 14 and 16.

However, when executing the policy, the path completion algorithm selected arcs 13 and 15 to complete the path. Since neither arcs 13,14,15, nor 16 were known, they were assumed to take their worst-case values. This made the cost of executing 13-15 equivalent to 14-16, when in actuality the latter has lower cost.


Fix
In order to distinguish between good and bad completion paths, I tried assuming expected arc costs for the unknown arcs rather than their worst case costs. This improved the solution cost drastically.

I originally used worst-case costs as estimates of arc costs because I was doing state aggregation using quality levels. The quality estimate would only monotonically increase when using worst-case arc costs. Here I am not performing state aggregation.