Issue279

Title M&S: "somewhat" greedy bisimulation
Priority wish Status resolved
Superseder Nosy List joerg, malte, mkatz, raznis
Assigned To malte Keywords
Optional summary

Created on 2011-09-12.19:21:40 by malte, last changed by malte.

Messages
msg1965 (view) Author: joerg Date: 2011-11-17.12:10:03
Right, that's what we're currently resolving in the context of the ICAPS 
paper. Will have to see later how to treat it in the journal submission.

best,
joerg
>
> ----------
> status: in-progress ->  resolved
>
> _______________________________________________________
> Fast Downward issue tracker<downward.issues@gmail.com>
> <http://issues.fast-downward.org/issue279>
> _______________________________________________________
>
msg1964 (view) Author: malte Date: 2011-11-17.12:05:08
Since this is fully implemented, I'm marking this as closed and merging it back
to the main code. One thing we discussed by email is that I don't currently
agree with how we generalized this to action costs (I think that the decisions
should be based on edge cost plus h value rather than h value alone; that
doesn't make a difference in the unit-cost case, but it does in the general
setting), but I'll leave that for the future.

(Would love to see that resolved before any future publications on this, though.)
msg1778 (view) Author: joerg Date: 2011-09-19.11:13:56
> So my recommendation would be that we should adapt the journal paper text to
> talk about very greedy bisimulation instead.

The journal paper currently introduces both versions. The "soemwhat greedy" one
is required for our positive theory result in Philosophers, so I'd propose to
leave both in.
msg1777 (view) Author: malte Date: 2011-09-18.12:14:24
To wrap this up, it looks like the "somewhat greedy" bisimulation that we
described in the IJCAI paper doesn't perform interestingly differently from
exact bisimulation, while the "very greedy" one that we actually used does.

So my recommendation would be that we should adapt the journal paper text to
talk about very greedy bisimulation instead.

Thoughts?

(Of course the story might be different when we also vary other things in the
bisimulation computation, such as how exactly to do the refinement, but at least
all our current data points into one direction.)
msg1776 (view) Author: malte Date: 2011-09-18.12:11:24
Results for "goal, then cg, then level" are in:

http://www.informatik.uni-freiburg.de/~helmert/exp-issue279-2-eval-abs-d.html
http://www.informatik.uni-freiburg.de/~helmert/exp-issue279-2-eval-abs-p.html

They are generally inferior to the other merge strategies, although we get some
new best results in Elevators (+1) and Trucks (+2) if I checked everything
correctly. No significant new insights from the experiments with this merge
strategy.
msg1775 (view) Author: malte Date: 2011-09-16.18:25:09
Results with the original h^HHH merge strategy are in:

http://www.informatik.uni-freiburg.de/~helmert/exp-issue279-2-eval-abs-d.html
http://www.informatik.uni-freiburg.de/~helmert/exp-issue279-2-eval-abs-p.html

Also with this merge strategy, "somewhat greedy" is very close to "not greedy"
and further away from "really greedy". However, there are now a few more cases
where somewhat greedy works well, and in particular the best overall config in
terms of coverage with this merge strategy is a somewhat geedy one (somewhat
greedy DFP-10K). However, this is not quite as good as the best overall config
with the inverse linear merge strategy we used before, although it is close.

An oracle still would not need the "somewhat greedy" choice.

Comparing this merge strategy ("h^HHH") to the other one ("IJCAI"), the picture
is interesting:

- with the "bisim" strategies, the h^HHH merge strategy gives much better
overall coverage for exact and somewhat greedy and much worse for very greedy.
- with the "DFP" strategies, the differences are smaller and the signs are
exactly the opposite (so h^HHH is better for very greedy).

Of course, as always, a few domains might be responsible for this. In
particular, the cases where we get worse with h^HHH merging can largely be
explained by Gripper.

If we pick the best coverage config per domain, then in 6 cases we must use a
h^HHH merge strategy config, in 8 cases an IJCAI merge strategy config, and in
the other 16 cases either merge strategy can lead to best performance.

I'm still waiting for results on the "goal, then cg, then level" merge strategy
(h^HHH is the related "cg, then goal, then level").
msg1770 (view) Author: malte Date: 2011-09-14.16:37:23
(No) progress update: the queue is very busy right now. It will take a few days
until the experiments will run.
msg1769 (view) Author: joerg Date: 2011-09-14.15:55:04
very interesting, thanks! could you put Michael on the "listening" set 
for this issue as well?

On 14/09/2011 14:31, Malte Helmert wrote:
> Malte Helmert<malte.helmert@unibas.ch>  added the comment:
>
> Results are in:
>
> http://www.informatik.uni-freiburg.de/~helmert/exp-issue279-eval-abs-d.html#toc2
> http://www.informatik.uni-freiburg.de/~helmert/exp-issue279-eval-abs-p.html#toc2
>
> Summary: "somewhat greedy" bisimulation (i.e. what we call greedy bisimulation
> in the paper) gives very similar but slightly worse results to "non-greedy"
> (exact) bisimulation in all combinations. There is no domain where the best
> "somewhat greedy" config is better than the best exact or greedy config. (In
> other words, a per-domain "config oracle" would never need to enable somewhat
> greedy bisimulation.)
>
> To make this a "cross product of options" evaluation, I also investigated some
> options that we didn't investigate before, namely DFP-like options that always
> do greedy bisimulation. Some of these do quite well; not topping any overall
> records, but they would be selected by a config oracle in two domains
> (scanalyzer and tpp) and have solid overall performance.
>
> If this were the end of the story, I think we could discard "somewhat greedy"
> altogether since it's not interestingly different from "exact". However, I
> expect that there is a big influence of merge strategy here. Our current merge
> strategy tends to start with lots of non-goal variables (at least in most
> domains) which means that all abstract h values are the same for a very long
> time, which gives the following behaviour:
>
>   - somewhat greedy bisimulation ignores nothing
>     =>  degenerates to exact bisimulation
>   - greedy bisimulation ignores everything
>     =>  completely throws away all info about abstract state variables until it
>        first hits a goal variable
>
> Both of these behaviours are rather extreme, so I'll now do some experiments
> with other merge strategies to see if they give us a more varied picture.
>
> _______________________________________________________
> Fast Downward issue tracker<downward.issues@gmail.com>
> <http://issues.fast-downward.org/issue279>
> _______________________________________________________
>
msg1766 (view) Author: malte Date: 2011-09-14.14:31:40
Results are in:

http://www.informatik.uni-freiburg.de/~helmert/exp-issue279-eval-abs-d.html#toc2
http://www.informatik.uni-freiburg.de/~helmert/exp-issue279-eval-abs-p.html#toc2

Summary: "somewhat greedy" bisimulation (i.e. what we call greedy bisimulation
in the paper) gives very similar but slightly worse results to "non-greedy"
(exact) bisimulation in all combinations. There is no domain where the best
"somewhat greedy" config is better than the best exact or greedy config. (In
other words, a per-domain "config oracle" would never need to enable somewhat
greedy bisimulation.)

To make this a "cross product of options" evaluation, I also investigated some
options that we didn't investigate before, namely DFP-like options that always
do greedy bisimulation. Some of these do quite well; not topping any overall
records, but they would be selected by a config oracle in two domains
(scanalyzer and tpp) and have solid overall performance.

If this were the end of the story, I think we could discard "somewhat greedy"
altogether since it's not interestingly different from "exact". However, I
expect that there is a big influence of merge strategy here. Our current merge
strategy tends to start with lots of non-goal variables (at least in most
domains) which means that all abstract h values are the same for a very long
time, which gives the following behaviour:

 - somewhat greedy bisimulation ignores nothing
   => degenerates to exact bisimulation
 - greedy bisimulation ignores everything
   => completely throws away all info about abstract state variables until it
      first hits a goal variable

Both of these behaviours are rather extreme, so I'll now do some experiments
with other merge strategies to see if they give us a more varied picture.
msg1765 (view) Author: malte Date: 2011-09-12.19:21:39
Our current "greedy bisimulation" in the code is actually more greedy than
described in the paper. It ignores transitions that lead away from the abstract
goal (as in the paper) but also transitions that leave the abstract goal the
same (unlike the definition in the paper).

I'll leave the current add a third option for the "greedy" parameter,
"greedy=somewhat" (in addition to "greedy=false" and "greedy=true") that
corresponds to the definitions in the paper and evaluate it.
History
Date User Action Args
2011-11-17 12:15:24maltesetstatus: chatting -> resolved
2011-11-17 12:10:03joergsetstatus: resolved -> chatting
messages: + msg1965
2011-11-17 12:05:08maltesetstatus: in-progress -> resolved
messages: + msg1964
2011-09-19 11:13:56joergsetmessages: + msg1778
2011-09-18 12:14:24maltesetmessages: + msg1777
2011-09-18 12:11:25maltesetmessages: + msg1776
2011-09-16 18:25:09maltesetmessages: + msg1775
2011-09-14 16:37:23maltesetmessages: + msg1770
2011-09-14 15:59:59mkatzsetnosy: + mkatz
2011-09-14 15:55:05joergsetmessages: + msg1769
2011-09-14 14:31:40maltesetmessages: + msg1766
2011-09-12 19:21:40maltecreate