Interesting experiment. Some of the scatter plots have curious shapes, with
quite different trends for tasks where the total time is low vs. tasks where it
is high, and I find them hard to interpret. Perhaps this is because of the time
limits in place for the abstraction generation.
It is interesting to see the performance drop significantly from
AdaptiveQueue to HeapQueue, and then go back up again significantly by adding
low-h tie-breaking to HeapQueue. This shows that adding the tie-breaking to the
heap queue *is* very important. Why, then, is the tie-breaking heap queue worse
than the adaptive queue? Is it because of overhead (the adaptive queue being
faster), or is it because of something else?
I think it is actually mainly about something else. The adaptive queue usually
behaves like a bucket queue. I assume there are only a handful of domains
(certainly parcprinter, and that might well be the only one) where it will
behave like a heap queue instead. Now, our bucket queue uses LIFO tie-breaking,
so it prefers to go deep. This is actually quite a good tie-breaking strategy in
this case: with a perfect heuristic, it's often a much better tie-breaking
strategy than low-h. In uniform-cost domains, they are the same given a perfect
heuristic, but in general-cost domains, they are not.
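To make the mechanism concrete, here is a minimal sketch of a bucket queue
whose buckets are stacks; this is where the LIFO tie-breaking comes from. The
names and details are assumptions for illustration, not our actual
AdaptiveQueue code:

    #include <algorithm>
    #include <cassert>
    #include <cstddef>
    #include <utility>
    #include <vector>

    // Sketch only: a bucket queue over small integer keys. Each bucket
    // is a stack (push_back/pop_back), so entries with equal keys are
    // popped LIFO, i.e., the search prefers to go deep. An adaptive
    // queue would start out like this and fall back to a heap once the
    // keys grow too large to use as bucket indices.
    template<typename Value>
    class BucketQueue {
        std::vector<std::vector<Value>> buckets;  // buckets[key] is a stack
        std::size_t current_key = 0;  // lower bound on lowest nonempty bucket
        std::size_t num_entries = 0;
    public:
        void push(std::size_t key, const Value &value) {
            if (key >= buckets.size())
                buckets.resize(key + 1);
            buckets[key].push_back(value);
            current_key = std::min(current_key, key);
            ++num_entries;
        }

        std::pair<std::size_t, Value> pop() {
            assert(num_entries > 0);
            while (buckets[current_key].empty())
                ++current_key;  // skip exhausted buckets
            Value value = buckets[current_key].back();  // LIFO within bucket
            buckets[current_key].pop_back();
            --num_entries;
            return {current_key, value};
        }

        bool empty() const {
            return num_entries == 0;
        }
    };

Nothing in this structure looks at h beyond the key itself, so the tie-breaking
we get within a bucket is purely an artifact of the stack discipline.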
In particular, in domains where all operator costs are 0 and 1, LIFO
tie-breaking pushes on down one particular path after applying a 0-cost
operator, while low-h tie-breaking wanders around aimlessly on 0-cost plateaus.
In domains like openstacks, LIFO clearly makes a lot more sense. (That isn't to
say it's a universally good tie-breaking strategy; what we would really like is
tie-breaking in favor of low-d, i.e., distance to the goal ignoring action
costs, but of course we don't have that information.)
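For contrast, a low-h tie-breaking heap can be sketched as an ordinary binary
heap over (f, h) pairs. Again, the names are assumptions for illustration, not
the actual HeapQueue interface:

    #include <queue>
    #include <tuple>
    #include <vector>

    struct Entry {
        int f;         // g + h, the primary key
        int h;         // heuristic value, used only to break ties on f
        int state_id;  // payload
    };

    // std::priority_queue is a max-heap, so "greater than" means lower
    // priority: comparing (f, h) lexicographically pops the entry with
    // the lowest f, breaking ties in favor of low h.
    struct LowHTieBreaking {
        bool operator()(const Entry &lhs, const Entry &rhs) const {
            return std::tie(lhs.f, lhs.h) > std::tie(rhs.f, rhs.h);
        }
    };

    using TieBreakingHeap =
        std::priority_queue<Entry, std::vector<Entry>, LowHTieBreaking>;

This also shows why low-h is blind on 0-cost plateaus: after a 0-cost operator
on an optimal path, neither f nor h changes (with a perfect heuristic), so all
plateau entries compare equal under (f, h) and the heap's internal order
decides, whereas LIFO at least commits to one path.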
Going through the detailed results for score_total_time, the low-h tie-breaking
strategy mainly does poorly in domains where the operator costs are 0 and 1. I
think it would be quite interesting to see the results separated out by the
costs used per domain: unit, binary (0 or 1), or general. It would also be
interesting to look at the comparisons between the adaptive queue and the
tie-breaking heap queue for each of these classes. That isn't to say that we
need to produce these numbers, just that they would be interesting to see.
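Just to pin down what I mean by the classification, a hypothetical helper (the
names are mine, not anything in the code base) could label a domain by the set
of operator costs that occur in it:

    #include <vector>

    enum class CostType {UNIT, BINARY, GENERAL};

    // Hypothetical classification: UNIT if every operator costs 1,
    // BINARY if every operator costs 0 or 1, GENERAL otherwise.
    CostType classify_costs(const std::vector<int> &operator_costs) {
        bool all_one = true;
        bool all_zero_or_one = true;
        for (int cost : operator_costs) {
            all_one = all_one && cost == 1;
            all_zero_or_one = all_zero_or_one && (cost == 0 || cost == 1);
        }
        if (all_one)
            return CostType::UNIT;
        if (all_zero_or_one)
            return CostType::BINARY;
        return CostType::GENERAL;
    }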
In summary, it looks like we currently get good performance because we already
do good tie-breaking almost by accident. If the adaptive queue behaved like a
bucket queue less frequently (e.g. if we multiplied all action costs by 1000 in
all domains), this would no longer be true, and explicit tie-breaking would be
very useful. So we're basically saved here by the fact that action costs tend to
be low in almost all domains.