> (most tasks below 1 second, only huge tasks take
> longer than 10 seconds, worst case 13.33s for a satellite task)
I measured the longest time for creating the successor generators with the
current preprocessor across the Satellite tasks and got 10.26s (for #33, the
biggest one). The discrepancy to the reported 13.33s may just be because I was
testing while the grid wasn't used by any other task. Even if not, I guess we
could survive a roughly 30% slowdown in a part that is generally not
time-critical. From my side, successor generator creation time looks fine.
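For reference, this kind of measurement can be scripted as a simple wall-clock
loop over the translated tasks. A minimal sketch, assuming the usual
preprocessor binary that reads the translator output from stdin (the paths are
hypothetical, and this times the whole preprocessor run, which
over-approximates successor generator creation):

    import glob
    import subprocess
    import time

    # Hypothetical locations; adjust to the local checkout and benchmark set.
    PREPROCESS = "./src/preprocess/preprocess"
    TASKS = sorted(glob.glob("satellite-sas/*.sas"))

    slowest = (None, 0.0)
    for task in TASKS:
        start = time.time()
        # The preprocessor reads the translator output from stdin and writes
        # its result to the file "output" in the current directory.
        with open(task) as f:
            subprocess.run([PREPROCESS], stdin=f,
                           stdout=subprocess.DEVNULL, check=True)
        elapsed = time.time() - start
        if elapsed > slowest[1]:
            slowest = (task, elapsed)

    print("slowest task: %s (%.2fs)" % slowest)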
> Here are the results of the satisficing experiments
Wow, some *huge* differences there. These may mostly be due to different
tie-breaking between operators. Given that the overall results are no worse
than before, I don't think it's necessary to look into them much more deeply
unless someone wants to.
However, we should also test our "go-to" config: LAMA. (Alternatively, since
I'm not sure our experiments support the anytime configurations very well: the
initial configuration of LAMA.) This is what we recommend to everyone as the
default, so it should be one of the configurations we always test. To make our
lives easier, maybe it's worth adding an alias for it to our downward script,
e.g. "lama-first" or "lama-initial"?
The results with randomized=true vary much less than the non-randomized ones,
and the overall trends look similar to blind search. In terms of average
overhead, I think things look quite fine.
Regarding LM-Cut and changed expansion counts, there are exactly six domains
where the number of reopened states has changed, and these are exactly the
domains where we see a change in the number of expansions until last jump.
That's good enough for me not to look into this further.
Depending on how thoroughly we want to go over these experiments, I see up to
three open points:
1. It would be good to have data for a LAMA-style configuration.
2. The iPDB performance issue is still not solved. It doesn't have to be solved
here, though, as there is a separate issue open for that already.
3. While we do a bit better overall with blind search, there are also some
domains where we become substantially slower (several hundred percent). I
checked three domains that stood out in the scatter plots: airport, mprime and
sokoban-opt11. In mprime, the difference may be due to the (in some cases,
substantially increased) number of evaluations. In the other two domains, the
number of evaluations doesn't change substantially, and I guess it's the
successor generators that are slower. This may be worth looking into. On the
positive side, the opposite also happens. One particularly nice domain is
tidybot, where the successor generators are apparently much faster now (e.g.
1134 seconds => 79 seconds in one task); the number of evaluations doesn't
change significantly here.
Of course, someone should also look over the code. I can try to do this, but I
have quite a pile-up of todos at the moment.