Issue271

Title: M&S: experiment with what to do when the size limit is hit in bisimulation strategies
Priority: wish
Status: resolved
Superseder:
Nosy List: joerg, malte, raznis
Assigned To: malte
Keywords:
Optional summary:

Created on 2011-08-29.11:46:35 by malte, last changed by malte.

Messages
msg3293 (view) Author: malte Date: 2014-08-09.20:01:13
If I recall correctly, we did quite a few experiments back then and the current
state is probably fine. I'm hence marking this as resolved.
msg1963 (view) Author: malte Date: 2011-11-17.12:00:56
I've merged this back to the default repository. We might eventually want to get
rid of some of the more complicated strategies (assuming that they don't turn
out to be particularly useful), but that will be a separate issue.
msg1715 (view) Author: malte Date: 2011-09-01.10:56:48
As a final data point, I also generated data for 1K:

http://www.informatik.uni-freiburg.de/~helmert/exp-issue211-15-eval-abs-d.html
(by domain)
http://www.informatik.uni-freiburg.de/~helmert/exp-issue211-15-eval-abs-p.html
(individual task results)

They confirm the general trends. Based on this data, #1 should probably be the
default technique since it is simple yet effective.

I'll leave this open for now since this might interact with other features we
still need to look at:

 1) whether to initialize by h value (as in DFP) or by goal/non-goal (as in
    Bisim)
 2) whether to bisimulate h value by h value or group by group
 3) merge strategy
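
To make point 1 concrete, here is a minimal C++ sketch of the two
initialization options. All names are invented for illustration; this is
not the actual shrink_bisimulation code:

    #include <cstddef>
    #include <map>
    #include <vector>

    using Group = std::vector<int>;  // indices of abstract states in one group

    // Initialization as in DFP: one initial group per h value.
    std::vector<Group> init_by_h(const std::vector<int> &h_values) {
        std::map<int, Group> buckets;
        for (std::size_t state = 0; state < h_values.size(); ++state)
            buckets[h_values[state]].push_back(static_cast<int>(state));
        std::vector<Group> groups;
        for (auto &entry : buckets)
            groups.push_back(std::move(entry.second));
        return groups;
    }

    // Initialization as in Bisim: two initial groups, goal vs. non-goal
    // (one of them may stay empty in degenerate cases).
    std::vector<Group> init_by_goal(const std::vector<bool> &is_goal) {
        std::vector<Group> groups(2);
        for (std::size_t state = 0; state < is_goal.size(); ++state)
            groups[is_goal[state] ? 0 : 1].push_back(static_cast<int>(state));
        return groups;
    }

The h-based start bakes the h partition in from the beginning, while the
goal/non-goal start leaves more work (and more freedom) to the refinement
loop; how this interacts with the size-limit strategies is part of what
remains to be tested.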
msg1704 (view) Author: malte Date: 2011-08-31.11:43:02
Data with 10K limit is ready:
http://www.informatik.uni-freiburg.de/~helmert/exp-issue211-14-eval-abs-d.html
(by domain)
http://www.informatik.uni-freiburg.de/~helmert/exp-issue211-14-eval-abs-p.html
(individual task results)

Executive summary:

- 10K still much better than 200K, as in our paper
- #1/#3/#4 significantly better than #2, as with 200K
- This time, #1 > #3 > #4 in terms of coverage, but again they are
  very close together.
- Looking at runtime, #1 wins (although not hugely so).
msg1703 (view) Author: malte Date: 2011-08-30.18:50:46
First results with DFP-bop and 200K limit here:
http://www.informatik.uni-freiburg.de/~helmert/exp-issue211-13-eval-abs-d.html
(by domain)
http://www.informatik.uni-freiburg.de/~helmert/exp-issue211-13-eval-abs-p.html
(individual task results)

Summary: #1, #3 and #4 are very similar to each other in terms of coverage, but
#4 wins by one task. #2 is much worse. However, #1 is faster, probably to the
extent that it may be preferable to #4. But I haven't looked at the data very
closely yet.

I'll generate data for smaller abstraction limits too (which worked better in
our IJCAI paper).

I had an initial look at one of the domains where #2 is much worse than the
others, Sokoban, and something interesting is going on there, at least in
the one case I checked thoroughly: #1/#3/#4 tend to hit situations after a
while where
even the very first refinement step exceeds the 200K limit (or at least that's
what it looks like), and since they refuse to do a refinement step in such a
case, this means that they'll abstract down to 1 abstract state, i.e.,
completely forget about all variables considered so far.

This seems to work OK in Sokoban since the final h quality isn't too great
anyway (or at least that's the impression I have), and just forgetting
everything after a while makes the h construction more efficient. Or at least
that's my interpretation of the results. So in a way I'm not sure we should be
happy about the better performance of strategies #1/#3/#4 here. ;-)

I expect that this interacts quite heavily with the merge strategy, with whether
we proceed "group by group" or "h by h", and with whether we initialize by h
initially or by "goal vs. non-goal", so I wouldn't put too much weight on the
results so far. Still, they are not without interest.
msg1698 (view) Author: malte Date: 2011-08-29.11:52:04
Correction: "at_limit" is a better name than "on_limit".

The current behaviour (SKIP_AND_KEEP_GOING) will be the default value.
msg1697 (view) Author: malte Date: 2011-08-29.11:46:34
From an email I sent to Raz and Jörg a while ago:

==========================================================================
** What to do if a split isn't possible due to a size limit?

Some possibilities:

1. Return immediately.
2. Split as far as the size limit allows, then return.
3. Don't split at all; continue with next possible split; return
   after current iteration.
4. Don't split at all; continue with next possible split; do further
   iterations until nothing can be split any more.

Currently we use 4. We mainly do this because that's what the original
pseudo-code by D., F. and P. does, but since this looks like quite a corner
case, I'm not sure it's the best choice. The other choices certainly look
like they should be
easier to implement and allow for more efficient implementations, so
we should probably revisit this.

My suggestion is to go with 2.: it's more consistent with how the
bucket-based shrink strategies work, and it ensures that when asked to
shrink down to a size limit of 10K, we actually go down to 10K and not
something like 9997. It should also be easier to implement than 3. or 4.
==========================================================================

I'll explore these four possibilities in this issue. The plan is to
define a new parameter for shrink_bisimulation, "on_limit", which
takes four possible values corresponding to the four possibilities
above:

 1. RETURN
 2. USE_UP
 3. SKIP_AND_COMPLETE_ITERATION
 4. SKIP_AND_KEEP_GOING
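
For illustration, here is a minimal C++ sketch of where these four values
would hook into the refinement loop. All names are invented, and the
signature computation (which in real bisimulation depends on the transitions
and the current partition) is abstracted into a callback; this is not the
actual shrink_bisimulation code:

    #include <cstddef>
    #include <functional>
    #include <map>
    #include <vector>

    using Group = std::vector<int>;
    using Partition = std::vector<Group>;

    enum class AtLimit {
        RETURN,                       // 1. return immediately
        USE_UP,                       // 2. split as far as the limit allows
        SKIP_AND_COMPLETE_ITERATION,  // 3. skip; finish iteration, then stop
        SKIP_AND_KEEP_GOING           // 4. skip; keep iterating (current)
    };

    // Toy stand-in for one refinement step: split a group according to a
    // state signature supplied as a callback.
    static Partition split_group(const Group &group,
                                 const std::function<int(int)> &signature) {
        std::map<int, Group> parts;
        for (int state : group)
            parts[signature(state)].push_back(state);
        Partition result;
        for (auto &entry : parts)
            result.push_back(std::move(entry.second));
        return result;
    }

    // Assumes groups.size() <= max_groups on entry.
    void refine(Partition &groups, std::size_t max_groups,
                const std::function<int(int)> &signature, AtLimit at_limit) {
        bool changed = true;
        while (changed) {
            changed = false;
            bool stop_after_iteration = false;
            for (std::size_t i = 0; i < groups.size(); ++i) {
                Partition parts = split_group(groups[i], signature);
                if (parts.size() <= 1)
                    continue;  // group is stable, nothing to split
                if (groups.size() - 1 + parts.size() > max_groups) {
                    // Performing this split would exceed the size limit.
                    if (at_limit == AtLimit::RETURN)
                        return;
                    if (at_limit == AtLimit::USE_UP) {
                        // Merge surplus parts so that exactly max_groups
                        // groups remain, apply the partial split, then stop.
                        std::size_t budget = max_groups - (groups.size() - 1);
                        while (parts.size() > budget) {
                            parts[budget - 1].insert(parts[budget - 1].end(),
                                                     parts.back().begin(),
                                                     parts.back().end());
                            parts.pop_back();
                        }
                        groups.erase(groups.begin() + i);
                        groups.insert(groups.end(), parts.begin(), parts.end());
                        return;
                    }
                    if (at_limit == AtLimit::SKIP_AND_COMPLETE_ITERATION)
                        stop_after_iteration = true;
                    continue;  // both SKIP variants leave this group unsplit
                }
                // Replace the group by its parts.
                groups.erase(groups.begin() + i);
                groups.insert(groups.end(), parts.begin(), parts.end());
                --i;  // revisit the element that slid into position i
                changed = true;
            }
            if (stop_after_iteration)
                return;
        }
    }

Note that the sketch fixes the signature, whereas the real algorithm
recomputes it from the current partition; that recomputation is why options
3 and 4 can genuinely differ: further iterations may discover splits that
were not possible earlier.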
History
Date                 User   Action  Args
2014-08-09 20:01:13  malte  set     status: chatting -> resolved
                                    messages: + msg3293
2011-11-17 12:00:56  malte  set     messages: + msg1963
2011-09-01 10:56:49  malte  set     messages: + msg1715
2011-08-31 11:43:02  malte  set     messages: + msg1704
2011-08-30 19:03:24  malte  set     nosy: + joerg
2011-08-30 18:50:47  malte  set     messages: + msg1703
2011-08-29 11:52:04  malte  set     messages: + msg1698
2011-08-29 11:46:35  malte  create