Issue880

Title: CEGAR: remove transitions in batches
Priority: wish        Status: resolved
Superseder:           Nosy List: jendrik, malte
Assigned To: jendrik  Keywords:
Optional summary:

Created on 2018-12-15.17:06:47 by jendrik, last changed by malte.

Messages
msg8399 (view) Author: malte Date: 2018-12-19.22:56:23
I'm not sure it's that much of a bottleneck (in the experimental results,
finding the traces looks much more expensive), but if it is, you can probably
do better than merely replacing one O(n) algorithm with a faster one based on
appropriate data structures. For example, you could probably do it in O(1) time
with a lazy approach that only replaces outdated transitions once they are
encountered. With a more holistic look at the operations required in the CEGAR
loop, perhaps it would also not be necessary to store the transitions in both
directions.
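
A rough sketch of the lazy idea (hypothetical names; this illustrates the
suggestion, not the reviewed code): splitting a state only bumps a generation
counter in O(1), and traversals detect stale transitions when they encounter
them instead of rewiring everything eagerly.

    #include <vector>

    // Hypothetical sketch: each abstract state carries a generation counter.
    // Splitting bumps the counter in O(1); transitions remember the
    // generation of their target at creation time, so traversals can detect
    // (and then repair or discard) outdated transitions lazily.
    struct LazyTransition {
        int target_id;
        int target_generation;  // generation of the target when created
    };

    struct Abstraction {
        std::vector<int> state_generation;

        void on_split(int state_id) {
            // O(1) invalidation instead of rewiring all incident transitions.
            ++state_generation[state_id];
        }

        bool is_outdated(const LazyTransition &t) const {
            return t.target_generation != state_generation[t.target_id];
        }
    };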
msg8395 (view) Author: jendrik Date: 2018-12-19.10:47:54
Thanks for reviewing the code and the experimental results! I agree that I
should have pointed out the negative trends in more detail. The diff greatly
speeds up the rewiring process, which was (and still is for many tasks) the main
bottleneck of the CEGAR loop. From my point of view, this is very nice to have,
even if one of the landmarks+goals configurations solves 5 fewer tasks. Now that
the rewiring takes less time, I think we can look into speeding up the
computation of traces in a separate issue.

For completeness, using a single call to unordered_set::insert() instead of
count()+insert() speeds up rewiring a little bit:
https://ai.dmi.unibas.ch/_tmp_files/seipp/issue880-v2-issue880-v1-issue880-v2-compare.html
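
The change boils down to exploiting the return value of insert(), which already
reports whether the element was new, so the separate count() lookup is
redundant (a sketch with hypothetical names):

    #include <unordered_set>

    // insert() returns pair<iterator, bool>; .second is true iff the
    // element was actually inserted, making a preceding count() redundant.
    void add_if_new(std::unordered_set<int> &seen, int transition_id) {
        // Before: two hash lookups.
        //   if (seen.count(transition_id) == 0)
        //       seen.insert(transition_id);
        // After: one hash lookup.
        if (seen.insert(transition_id).second) {
            // transition_id was not present before.
        }
    }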

I merged the issue branch.
msg8384 (view) Author: malte Date: 2018-12-18.15:07:11
I left some comments on bitbucket.

Regarding the experimental results, the analysis comes across as a bit biased.
It only gives the good news and doesn't mention that there is also bad news. In
general, it would be great if a basic assessment of an issue could be done based
on what is mentioned in the issue tracker, without having to dig into the
detailed results. This could make many reviewing tasks much faster. But for this
the description would have to be fair, and it looks a bit cherry-picky to me.
(That can very easily happen accidentally; no complaints!)

To illustrate the point, let me focus on the negative outcomes instead:

With the same argument ("it performs worse") with which we discard the two 10M
configurations and focus only on the 1M results, we could instead discard the
two "original" configurations and focus only on "landmarks-goal". That argument
would perhaps be even more valid than the one for discarding the 10M results
because the amount by which "landmarks-goal" outperforms "original" is much
larger than the amount by which 1M outperforms 10M.

The good 1M configuration outperforms 10M by only 2 tasks -- not enough evidence
that increasing the abstraction size is a bad idea in general. After all, if you
increase the amount of time/memory for something, a factor of 10 is a large
jump. 1M might very well be too low even if 10M is too high. And if we see good
results (+2 coverage) for landmarks-goal-1M and bad results (-5 coverage) for
landmarks-goal-10M, does the change hurt or help our good landmarks-goal config?

Beyond coverage, if we focus on the two landmarks-goal configs, we see an
overall loss in all scores other than total time. The overall time for finding
flaws and for finding traces increases rather than decreasing. None of this is
alarming, but isn't it worth mentioning? Heuristic quality becomes generally
worse, also for the best overall config (with 1M). Given how close the
landmarks-goal 1M and 10M configs are in the baseline, that both of them suffer
in heuristic quality with this change, and that the change negatively affects
coverage overall, I don't think it looks like such a clear win.

I think the analysis should also mention that there is an effect on heuristic
quality, and that the effect is negative for the two best configurations. From
just the issue discussion alone, one could easily come to the wrong conclusion
that this is just a data structure improvement that does not affect what is
computed, only how fast. But that's not the case; we also change the heuristic
because the transitions end up in a different order than before. (The
description mentions that, but from "just" reading our CEGAR papers it is not
clear why this would affect the heuristic values. I assume it is because we find
different abstract plans depending on the ordering of transitions, as we don't
use any rigorous tie-breaking strategy there.)
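
As a hedged illustration of that assumption (not the actual trace-finding
code): a breadth-first search that scans each state's transition vector in
order keeps the first predecessor that reaches a state, so permuting the
vectors can change which of several equally short traces is found, and hence
which flaw is refined next.

    #include <queue>
    #include <vector>

    // Illustrative only: BFS whose tie-breaking depends on vector order.
    // Two permutations of the same successor vectors can yield different
    // (equally short) traces.
    std::vector<int> bfs_parents(
        const std::vector<std::vector<int>> &successors, int init) {
        std::vector<int> parent(successors.size(), -1);
        std::queue<int> open;
        parent[init] = init;
        open.push(init);
        while (!open.empty()) {
            int state = open.front();
            open.pop();
            for (int succ : successors[state]) {  // vector order breaks ties
                if (parent[succ] == -1) {
                    parent[succ] = state;
                    open.push(succ);
                }
            }
        }
        return parent;
    }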


Of course, all this is playing devil's advocate, and I've overemphasized the
negatives. It is meant mainly as a meta-comment on which points in a result
analysis help get a quick review. :-)

Weighing both the positives and the negatives, to be honest, I don't think the
experiments give all that much support for making the change, but there is also
not much reason to *not* make the change. So from my side, feel free to merge this.

More generally, it looks like there are still some interesting algorithm and
data structure considerations for the CEGAR loop that might be worth looking
into at some point if you're interested. If the "total_time_for_..." numbers in
the experiments include all relevant bottlenecks, it might well be worth
considering whether we could cut down the time for finding traces somewhat. But
then I'm not sure how much it matters in the grander scheme of things, where
other aspects such as cost partitioning come into play.
msg8367 (view) Author: jendrik Date: 2018-12-16.11:30:48
The experiment results are in:

https://ai.dmi.unibas.ch/_tmp_files/seipp/issue880-v1-issue880-base-issue880-v1-compare.html

I evaluate both a single Cartesian abstraction (original) and the
landmarks+goals additive variant. For both variants the experiment includes a
version with at most 1M transitions and a version with at most 10M transitions.
I included the latter to see whether it makes sense to raise the default of 1M
transitions, but using 10M leads to solving fewer tasks with and without the
patch. I'll therefore focus on results for 1M transitions.

The time for building an abstraction for each landmark and goal decreases
somewhat, but the effect is small since it doesn't take long to build these
abstractions in the first place:

https://ai.dmi.unibas.ch/_tmp_files/seipp/issue880-v1-init_time-cegar-landmarks-goals-1M-issue880-base-issue880-v1.png

The effect is much stronger when constructing a single abstraction:

https://ai.dmi.unibas.ch/_tmp_files/seipp/issue880-v1-init_time-cegar-original-1M-issue880-base-issue880-v1.png

The trend is mainly due to the fact that splitting a state (and rewiring the
transitions) is faster:

https://ai.dmi.unibas.ch/_tmp_files/seipp/issue880-v1-total_time_for_splitting_states-cegar-original-1M-issue880-base-issue880-v1.png

Since the patch removes all old transitions from each affected vector in one go,
these vectors are ordered differently now. Interestingly, this usually makes
finding traces faster:

https://ai.dmi.unibas.ch/_tmp_files/seipp/issue880-v1-total_time_for_finding_traces-cegar-original-1M-issue880-base-issue880-v1.png

This effect is unexpected, but I won't complain about it :-)

Malte, do you want to have a look at the patch, or should I just merge it?
msg8366 (view) Author: jendrik Date: 2018-12-15.19:40:56
I made a pull request at
https://bitbucket.org/jendrikseipp/downward/pull-requests/113 . Experiments are
running.
msg8362 (view) Author: jendrik Date: 2018-12-15.17:06:47
When rewiring states during a refinement step, we currently do a linear search
for each old transition separately. In preliminary experiments, deleting all old
transitions from each transition vector in one step sped up the refinement loop,
which is why I'd like to integrate this functionality into the master repo.
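
A minimal sketch of the batch deletion (hypothetical types; the real code
operates on the CEGAR transition vectors): instead of one linear search and
erase per outdated transition, a single erase-remove pass over each vector
removes them all at once.

    #include <algorithm>
    #include <vector>

    struct Transition {
        int op_id;
        int target_id;
    };

    // Before: each outdated transition is located with a linear search and
    // erased individually -- O(n) per removal, up to O(n^2) per rewiring.
    // After: all transitions incident to the split state are removed in one
    // linear pass using the erase-remove idiom.
    void remove_incident_transitions(
        std::vector<Transition> &transitions, int split_state_id) {
        transitions.erase(
            std::remove_if(
                transitions.begin(), transitions.end(),
                [split_state_id](const Transition &t) {
                    return t.target_id == split_state_id;
                }),
            transitions.end());
    }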
History
Date                 User     Action  Args
2018-12-19 22:56:23  malte    set     messages: + msg8399
2018-12-19 10:47:54  jendrik  set     status: reviewing -> resolved; messages: + msg8395
2018-12-18 15:07:11  malte    set     messages: + msg8384
2018-12-16 11:30:48  jendrik  set     status: chatting -> reviewing; messages: + msg8367
2018-12-15 19:40:57  jendrik  set     status: unread -> chatting; messages: + msg8366
2018-12-15 17:06:47  jendrik  create