Wednesday, January 25, 2006

Cyclic Nature of Tree Generation Process



These plots were generated for the small graph problem of 6 nodes and 8 arcs. I wanted to investigate whether iteratively generating the decision tree would eventually lead to convergence. As shown on the left this procedure does not converge, but rather cycles periodically. These plots are for a specific epsilon, but I have verified that similar results are seen for other values of epsilon. Rather than sampling a subset of the 256 world states, I used the entire set to generate the next decision tree. Each of these plots used different starting decision trees as the initializer. The initial decision tree is generated randomly by sampling with replacement assuming no computation costs.

In these plots it is seen the that the minimum cost occurs during the transient section. For other values of epsilon this varies.

Thursday, January 12, 2006

DTMDP with Minimum Tree Reinitialization


With previous decision tree results, I alway reinitialized a tree to the zero computational cost tree for each epsilon value. In these results, I store, for each epsilon value, the tree that yields the minimum expected cost. Then for the the next epsilon value, I use the minimum decision tree as the initializer. This yielded better results than reinitializing with the zero computational cost tree for higher ranges of epsilon. There were instances where the best value decreased with increasing iterations (hightlighted in red). Yellow highlightling shows expected value improvement. This was not the case when I reinitialized using the zero cost tree. It is clear that the initialization procedure has an impact on subsequent tree generation.

Data for DTMDP for Various Tree Generation Iterations


This shows the best value achieved by the DTMDP is reached only after one iteration of the algorithm. The highlighted cells in the table indicate an improvement over the previous trial. I tried a range of iterations: 1,5,10, and 20. It appears that the DTMDP solution does not improve after the 20th iteration.

The value of the DTMDP solution is not monotonically decreasing with the number of iterations, but bounces around. The values achieved from bouncing around appear to be within a bounded set, meaning that the various DTMDP averages for a given epsilon take on only a finite set of values.

The entire 2^n graph value states were used to generate new decision trees. The graph configuration used to generate these results was the Simple Graph Problem with 8 arcs and 6 nodes.

My next step will be to determine the mechanics of tree generation and whether there is some manner to force the solution to converge in a monotonic manner.

Monday, January 09, 2006

Preliminary Decision Tree MDP Plot



This is a plot of the results of the decision tree MDP on the Simple Graph Problem(8 arc, 6 nodes), where the arc costs can take the value 0 or 2 with equal probability. The decision tree MDP Algorithm is as follows:
1. Generate initial decision tree with computational cost set to zero.
2. Compress the decision tree into a quality tree which is used by the MDP.
3. From the quality tree get transition statistics and generate an metaplan.
4. From the metaplan, generatea new decision tree.
5. If no over max iteratations, goto 1.


There is instability in the generation the decision trees, where the expected value of the metaplan generated from successive iterations is not monotonically improving. This instability is currently not well understood. But in order to obtain some preliminary results, for each epsilon value, I ran the above algorithm 30 times, and recorded the result which yielded the minimum cost policy. The plot is given here.