Issue 884

Title: CEGAR: use h value for tie-breaking in A* search
Priority: wish
Status: resolved
Superseder:
Nosy List: jendrik, malte, silvan
Assigned To: jendrik
Keywords:

Created on 2018-12-19.18:18:00 by jendrik, last changed on 2019-01-01.13:41:54 by jendrik.

Messages
msg8426 Author: jendrik Date: 2019-01-01.13:41:54
The changes from v1 and v2 are merged now.
msg8424 Author: jendrik Date: 2018-12-28.12:51:49
Interesting analysis!

One minor comment regarding your first paragraph: there is no time limit for the
abstraction generation, only a maximum number of state-changing transitions.

I'll merge the changes from v1 and v2 and discard the changes to the open list
as proposed below.
msg8419 Author: malte Date: 2018-12-23.15:11:34
Interesting experiment. Some of the scatter plots have curious shapes, with
quite different trends for tasks where the time is low vs. tasks where it is
high, and I find them hard to interpret. Perhaps this is because of the time
limits in place for the abstraction generation.

It is interesting to see the performance drop significantly from AdaptiveQueue
to HeapQueue, and then go back up again significantly when low-h tie-breaking
is added to HeapQueue. This shows that adding the tie-breaking to the heap
queue *is* very important. Why, then, is the tie-breaking heap queue worse than
the adaptive queue? Is it because of overhead (the adaptive queue being
faster), or because of something else?

I think it is actually mainly about something else. The adaptive queue usually
behaves like a bucket queue. I assume there are only a handful of domains
(certainly parcprinter, and perhaps that is the only one) where it behaves like
a heap queue instead. Now, our bucket queue uses LIFO tie-breaking, so it
prefers to go deep. This is actually quite a good tie-breaking strategy in this
case: with a perfect heuristic, it's often a much better tie-breaking strategy
than low-h. In uniform-cost domains, the two are the same given a perfect
heuristic, but in general-cost domains, they are not.

In particular, in domains where all operator costs are 0 and 1, LIFO
tie-breaking keeps pushing down one particular path after applying a 0-cost
operator, while low-h tie-breaking wanders around aimlessly on 0-cost plateaus.
In domains like openstacks, LIFO clearly makes a lot more sense. (That isn't to
say it's a universally good tie-breaking strategy; what we would really like is
tie-breaking in favor of low-d, i.e., distance to the goal ignoring action
costs, but of course we don't have that information.)
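
To make the LIFO behavior concrete, here is a minimal bucket-queue sketch,
written for this discussion in the spirit of the bucket-based queue in
priority_queues.h (it is not the actual Fast Downward implementation). Popping
from the back of the lowest non-empty bucket is exactly LIFO tie-breaking among
entries with equal key, so the search keeps extending the path it just worked
on:

    #include <cassert>
    #include <iostream>
    #include <utility>
    #include <vector>

    class BucketQueueSketch {
        std::vector<std::vector<int>> buckets;  // buckets[key] holds values
        int current_key = 0;  // lowest possibly non-empty bucket index
        int num_entries = 0;
    public:
        void push(int key, int value) {
            if (key >= static_cast<int>(buckets.size()))
                buckets.resize(key + 1);
            buckets[key].push_back(value);
            if (key < current_key)
                current_key = key;
            ++num_entries;
        }

        std::pair<int, int> pop() {
            assert(num_entries > 0);
            while (buckets[current_key].empty())
                ++current_key;
            int value = buckets[current_key].back();  // LIFO within a bucket
            buckets[current_key].pop_back();
            --num_entries;
            return {current_key, value};
        }

        bool empty() const {
            return num_entries == 0;
        }
    };

    int main() {
        BucketQueueSketch open;
        open.push(3, 10);  // state 10, f = 3
        open.push(3, 11);  // state 11, f = 3, e.g. a successor of state 10
        open.push(3, 12);  // state 12, f = 3, pushed last
        while (!open.empty())
            std::cout << "pop state " << open.pop().second << "\n";
        // Prints 12, 11, 10: the most recently pushed state is popped
        // first, which is the "go deep" behavior described above.
    }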

Going through the detailed results for score_total_time, the low-h tie-breaking
strategy mainly does poorly in domains where the operator costs are 0 and 1. I
think it would be quite interesting to see the results separated out by the
costs used per domain: unit, binary (0 or 1), or general, and it would be
interesting to look at the comparisons between the adaptive queue and the
tie-breaking heap queue for these. But that isn't to say that we need to produce
these numbers, just that they would be interesting to see.

In summary, it looks like we currently get good performance because we already
do good tie-breaking almost by accident. If the adaptive queue behaved like a
bucket queue less frequently (e.g. if we multiplied all action costs by 1000 in
all domains), this would no longer be true, and explicit tie-breaking would be
very useful. So we're basically saved here by the fact that action costs tend to
be low in almost all domains.
msg8414 Author: jendrik Date: 2018-12-22.23:58:39
Thanks for the hints! I ran the suggested experiment:

https://ai.dmi.unibas.ch/_tmp_files/seipp/issue884-v4.html

The three configs are, from left to right:

issue884-v2-cegar-original-2M:            AdaptiveQueue w/o tiebreaking
issue884-v2-heap-queue-cegar-original-2M: HeapQueue w/o tiebreaking
issue884-v4-cegar-original-2M:            HeapQueue with tiebreaking

Here are scatter plots comparing the time for finding traces:

https://ai.dmi.unibas.ch/_tmp_files/seipp/issue884-v4-total_time_for_finding_traces-cegar-original-2M-issue884-v2-issue884-v2-heap-queue.png
https://ai.dmi.unibas.ch/_tmp_files/seipp/issue884-v4-total_time_for_finding_traces-cegar-original-2M-issue884-v2-issue884-v4.png
https://ai.dmi.unibas.ch/_tmp_files/seipp/issue884-v4-total_time_for_finding_traces-cegar-original-2M-issue884-v2-heap-queue-issue884-v4.png

The first and second plots show that switching from an AdaptiveQueue without
tiebreaking to a HeapQueue without or with tiebreaking usually increases the
time for finding traces significantly.

The third plot is probably the most interesting one since it allows us to
evaluate the usefulness of tiebreaking. I think the results are a bit
underwhelming. For the tasks where finding traces takes more than 100 seconds,
about as many tasks benefit from tiebreaking as are hurt by it. The difference
from the version without tiebreaking is at most a factor of ~5 in both
directions.

Given these results, I don't think trying to come up with an adaptive
tiebreaking queue has a high chance of speeding things up, and it would
probably be quite a bit of work. Rather than going down this path, I'd like to
try avoiding the A* search altogether, as we started discussing in issue883.
msg8412 Author: malte Date: 2018-12-22.03:34:34
> I think the first two changes could be merged in any case, as they separate
> the A* search code, which is needed while the abstraction is being built and
> runs in every iteration of the CEGAR loop, from the Dijkstra code, which is
> only used twice after each abstraction has been built. Do you agree?

They look good to me, though it wouldn't hurt to get a second opinion before
merging.

> Could you please have a look at the pull request or the commit for v3 and tell
> me if you see any way of speeding up the tiebreaking code?

The open queue implementation is not efficient enough. Maps are very slow, and
there is also a big overhead for using vectors as keys when you only need pairs.

I assume this is modelled after our TiebreakingOpenList class (or whatever it is
called exactly). This class uses a vector because it needs to support an
arbitrary number of tie breakers, and it uses a map because it was convenient to
implement at the time and all this is not (or at least not all that)
performance-critical.
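
For illustration, such an open list presumably has roughly the following shape
(assumed from the description above, not copied from the pull request). Every
push pays for a heap-allocated vector key and an O(log n) tree lookup with
element-wise vector comparisons:

    #include <deque>
    #include <map>
    #include <vector>

    using Key = std::vector<int>;            // e.g. {g + h, h}
    std::map<Key, std::deque<int>> buckets;  // sorted map: key -> FIFO bucket

    void push(int f, int h, int state) {
        buckets[{f, h}].push_back(state);    // allocates a fresh Key per push
    }

    int pop() {  // assumes the queue is non-empty
        auto it = buckets.begin();           // smallest key first
        int state = it->second.front();
        it->second.pop_front();
        if (it->second.empty())
            buckets.erase(it);
        return state;
    }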

In the overall search, the open list costs are dwarfed by all the other
per-state processing costs, like calling successor generators, evaluating
heuristics, computing and storing the states, etc. In an explicit graph search
context like this one, where a "state" is just an int, the balance changes.

For more efficient open lists, check the ones in priority_queues.h. The
heap-based one can easily be generalized to support keys that are pairs. The
bucket-based one is a bit more work, so I would suggest the following experiment:

1. Create a heap-based tie-breaking priority queue based on the one in
priority_queues.h (a sketch follows at the end of this message).
2. As a variant of the *current* (non-h-tie-breaking) code, replace
AdaptiveQueue with HeapQueue.

Compare these two versions with the original version. Then you can see how much
of the overhead is due to not using the bucket-based queue that AdaptiveQueue
will end up using in most cases (e.g. in all unit-cost domains) and how much of
the overhead is due to the tie-breaking. Depending on the outcome, we can then
see if it is worth investigating bucket-based tie-breaking queues.
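
A minimal sketch of the heap-based tie-breaking queue from step 1, in the
spirit of the heap-based queue in priority_queues.h (the class and method names
below are assumptions for illustration, not the actual Fast Downward API):

    #include <queue>
    #include <utility>
    #include <vector>

    template<typename Value>
    class TieBreakingHeapQueue {
        using Key = std::pair<int, int>;  // (f, h): min f first, then min h
        using Entry = std::pair<Key, Value>;
        // std::greater turns the default max-heap into a min-heap.
        std::priority_queue<Entry, std::vector<Entry>,
                            std::greater<Entry>> heap;
    public:
        void push(const Key &key, const Value &value) {
            heap.push({key, value});
        }

        std::pair<Key, Value> pop() {
            Entry result = heap.top();
            heap.pop();
            return result;
        }

        bool empty() const {
            return heap.empty();
        }
    };

    // Usage in the abstract A* search, where a state is just an int:
    //     TieBreakingHeapQueue<int> open;
    //     open.push({g + h, h}, state_id);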
msg8409 Author: jendrik Date: 2018-12-21.23:05:33
I took a stab at this and made the following changes:

v1: Split astar_search() method into astar_search() and dijkstra_search().
v2: Move dijkstra_search() out of AbstractSearch class.
v3: Use h values for tiebreaking in astar_search().

Here are the corresponding experimental results:

https://ai.dmi.unibas.ch/_tmp_files/seipp/issue884-v3-issue884-base-issue884-v1-compare.html
https://ai.dmi.unibas.ch/_tmp_files/seipp/issue884-v3-issue884-v1-issue884-v2-compare.html
https://ai.dmi.unibas.ch/_tmp_files/seipp/issue884-v3-issue884-v2-issue884-v3-compare.html

The first two changes have almost no effect, which was expected since they were
both only made to make the code more modular and prepare for the third change.
The third change itself slows down the A* search significantly, and I think the
slowdown is too big for this version of the code to be merged.

Here is the pull request and the individual changesets:

https://bitbucket.org/jendrikseipp/downward/pull-requests/117

v1:
https://bitbucket.org/jendrikseipp/downward/commits/5debf72e08e3a6a4ed6092414a67ecb281aa007d
v2:
https://bitbucket.org/jendrikseipp/downward/commits/497d12693c176848c9524aeaa9a3da6c2f73c6a2
v3:
https://bitbucket.org/jendrikseipp/downward/commits/9a0969896fa99b4a2bcc813eb5c7b63f0a33c39e

I think the first two changes could be merged in any case, as they separate the
A* search code, which is needed while the abstraction is being built and runs in
every iteration of the CEGAR loop, from the Dijkstra code, which is only used
twice after each abstraction has been built. Do you agree?

Could you please have a look at the pull request or the commit for v3 and tell
me if you see any way of speeding up the tiebreaking code?
msg8398 Author: jendrik Date: 2018-12-19.18:18:00
I'd like to make the CEGAR A* search faster and more robust by using h values as
tie-breakers.
History
Date                 User     Action  Args
2019-01-01 13:41:54  jendrik  set     status: reviewing -> resolved; messages: + msg8426
2018-12-28 12:51:49  jendrik  set     messages: + msg8424
2018-12-23 15:11:34  malte    set     messages: + msg8419
2018-12-22 23:58:39  jendrik  set     messages: + msg8414
2018-12-22 03:34:34  malte    set     messages: + msg8412
2018-12-21 23:05:33  jendrik  set     status: unread -> reviewing; messages: + msg8409
2018-12-19 18:18:48  silvan   set     nosy: + silvan
2018-12-19 18:18:00  jendrik  create