Monday, January 29, 2007

Learning Subproblems Additions for the m-TSP

Base Level Problem: Given a set of colocated homogeneous vehicles, a set of targets and their locations and their values, and a fuel constraint on vehicles, execute a set of tours that maximize the total value minus the tour length while satisfying the fuel constraint.

Problem Approach: The problem is hierarchically decomposed into a master level problem and subproblems. The subproblems consist of missions for the vehicles (i.e. the vehicles are assigned a subset of targets and must generate a tour where each target in the subset is visited once without violating the fuel constraint). The subproblems are characterized by a mission value, which is comprised of the value of the targets visited during the mission minus the cost of the tour, and the subset of targets visited. The master level problem is formulated as an binary integer program which selects from the set of subproblem, those with maximize the total value while satisfying the constraint that no target is ever visited more than once.

Meta-level Problem: The meta-level problem is to determine a policy for generating interesting subproblems given the results of the current set of subproblems. Solving subproblems take time. As a first pass the goal of the meta-level planner will be to maximize the following objective function:

sum(Value of Missions Selected) - epsilon*(WallClockTimeForGeneratingSubproblems)


In the future, we plan to apply add windows of opportunity to the problem, such that the targets will have start and end times which specify when they are available to be service by the vehicles. In this situation, a more natural objective function for the meta-level planner will be to just maximize the sum of the mission value. The epsilon weighting factor for the time cost will no longer be explicitly necessary, as the total time taken for solving subproblems will interact with the windows of opportunities of the targets to generate implicit time costs.



The graph to the right is a preliminary result for the case of 2 vehicles and 4 targets located in a 2x2 region with fuel constraint of 4. The target were each valued at 10 each. note the legend is mislabelled. I have plotted on the graph the results of the DTMP, Plan all and a Greedy Strategy for varying values of epsilon. The Plan All policy assumes that all 2^4 or 16 subproblems were generated prior to solving the master level problem. The Greedy Algorithm consisted of alternating vehicles selecting the nearest target to add to its subset. The DTMP result does not perform as well as the plan all strategy for epsilon= 0. I believe that the decision tree which dictates the order of subproblem generation stop short in many cases resulting the what I referred to as the path completion problems in previous posts.


The case for 4 vehicles and 8 targets over a 2x2 region with fuel = 4 is posted below. The poor performance may be due to insufficient sampling in addition to the path completion problem.



I am currently working to correct this problem.