Issue756

Title: Unintended behavior change for reward progress in lazy search
Priority: feature
Status: resolved
Superseder:
Nosy List: jendrik, malte, manuel, mkatz, silvan
Assigned To: manuel
Keywords:
Optional summary:

Created on 2017-12-20.12:21:57 by mkatz, last changed by manuel.

Messages
msg7337 (view) Author: manuel Date: 2018-07-30.14:30:57
I am closing this issue because most of us agree with keeping the current behavior.
msg7314 (view) Author: silvan Date: 2018-07-24.12:40:04
I was interested in getting this issue off my radar :-)

I think the differences are too small to add a new option here, and thus also
vote for keeping the current behavior.
msg7313 (view) Author: jendrik Date: 2018-07-24.12:13:30
I also like Manuel's suggestion.
msg7312 (view) Author: malte Date: 2018-07-24.12:11:27
Every option has a cost in maintenance, documentation, understandability,
cognitive load and debuggability. I would rather not introduce a new one unless
there is a significant gain to be had from that. I like Manuel's suggestion.
What do the others think? (For example Silvan, you inquired about this recently,
so perhaps you're interested in the status?)
msg7311 (view) Author: mkatz Date: 2018-07-24.12:04:24
How about keeping both options, introducing yet another parameter to the search?
msg7310 (view) Author: manuel Date: 2018-07-24.11:47:39
Rewarding progress at the initial state does not show significant effects on
eager search.

For the experiment, I set the boost value to 1000, because the default boost
value is 0.
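For reference, such a boosted eager configuration could be specified roughly like this (a sketch only; the evaluator name is just an example and the exact option syntax depends on the planner revision):
./fast-downward.py domain.pddl problem.pddl --heuristic "hff=ff()" --search "eager_greedy([hff], preferred=[hff], boost=1000)"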

Configuration issue756-base is without reward at the initial state and
issue756-v2 is with a reward at the initial state.

http://ai.cs.unibas.ch/_tmp_files/heusner/issue756-v2-eager-issue756-base-issue756-v2-compare.html

I suggest keeping the current behavior of lazy search and eager search.
msg7301 (view) Author: silvan Date: 2018-07-12.14:17:42
Manuel, any news here?
msg7240 (view) Author: manuel Date: 2018-06-07.19:21:32
Eager search does not reward progress at the initial state in the current
version of Fast Downward. In a previous message, I suggested also testing the
behavior of eager greedy for this issue. Let me run the experiments and report
the results here.
msg7235 (view) Author: malte Date: 2018-06-07.12:49:34
And all other things being equal, I agree Manuel: not rewarding the initial
state makes slightly more sense to me than rewarding it. But I think it should
be the same in lazy and eager search, and I did not check what currently happens
in eager search.
msg7234 (view) Author: malte Date: 2018-06-07.12:47:28
I looked at the original papers, and the descriptions are not formal enough to
be 100% sure how this case should be handled. For me, either rewarding or not
rewarding the initial state would make some sense, but the behaviour should be
consistent between lazy and eager search.
msg7228 (view) Author: silvan Date: 2018-06-07.10:09:23
So, again, regarding the issue: we should decide whether to keep the current
behavior or whether it makes more sense to switch to something closer to the
formal definition. Since I don't know about the latter, others who do should
speak up :-)
msg7219 (view) Author: mkatz Date: 2018-06-06.11:57:10
It looks like the current variant performs slightly better almost everywhere, which is
not what I observed in my experiments, though additional factors could have played a
role there. It might be worth taking another look if/when we decide to integrate red-black
heuristics.
msg7218 (view) Author: manuel Date: 2018-06-06.10:53:24
Revision issue756-base is without the reward at the initial state and issue756-v1 is
with the reward.

You can find the pull request on Bitbucket:
https://bitbucket.org/manuel_h/downward/pull-requests/6/issue756/diff
msg7217 (view) Author: silvan Date: 2018-06-06.10:52:54
The differences do not seem to be very large. So I think the decision of which
variant to prefer boils down to which variant is closer to the definition of
progress and to what one would expect.
msg7216 (view) Author: silvan Date: 2018-06-06.10:44:43
Which version is which? Can you also please post a pull-request?
msg7215 (view) Author: manuel Date: 2018-06-06.09:26:16
The results of the experiment do not show significant differences across
configurations and domains.

In my opinion, we should not reward progress at the initial state, because all
definitions of progress that I have found in the literature in the context of
boosting preferred operators define progress relative to heuristic estimates of
already expanded states. If the generation of the initial state were considered
progress, it should have been defined explicitly.

Here are the comparison tables:
http://ai.cs.unibas.ch/_tmp_files/heusner/issue756-v1-issue756-base-issue756-v1-compare.html#summary
msg7155 (view) Author: manuel Date: 2018-06-04.12:22:42
I will test and report the behavior of lazy search and ignore the other
observations for this issue. We may discuss the other issues offline.
msg7154 (view) Author: silvan Date: 2018-06-04.12:15:20
Then please change the reporting order as you think is best.
msg7153 (view) Author: manuel Date: 2018-06-04.12:13:52
The reporting is consistent among search algorithms. I just feel that the flow
of reporting progress is broken by reporting the initial heuristic value and the
pruning method.
msg7152 (view) Author: silvan Date: 2018-06-04.12:09:06
I don't quite understand: is there a difference in the reporting between the
different searches that you want to make consistent? If so, then that is good.
msg7151 (view) Author: manuel Date: 2018-06-04.12:06:35
I noticed that progress is also not rewarded in eager_greedy at the initial
state. I would test the behavior of rewarding progress at the initial state for
eager_greedy as well.

Moreover, I noticed that the initial heuristic values as well as the progress
methods are reported after the first progress report in eager and lazy
searches. I would expect both to be reported beforehand. This is a small issue
that could be fixed within this issue.

What do you think?
msg6759 (view) Author: mkatz Date: 2017-12-20.12:27:40
Last revision before issue77: 874838d3625e
Revision after issue77: bcb6cef0e11a
msg6758 (view) Author: malte Date: 2017-12-20.12:27:32
...or perhaps it's worth also testing cg and cea (both as single heuristics in
lazy search, of course with preferred operators), as we expected a behaviour
change that can be dramatic, but only for certain heuristics. Or at least that's
what Michael saw in his experiments.
msg6757 (view) Author: malte Date: 2017-12-20.12:26:34
Someone want to set up an experiment for this? The code change to test this is
simple, only a few lines. Michael or I can give pointers to the necessary
changes if it helps.

This is only relevant for configurations involving lazy search and preferred
operators, so I suggest we test lazy search with the FF heuristic and preferred
operators, lama-first, and lama (separately).
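For reference, these configurations could be invoked roughly as follows (sketches only; the exact option syntax depends on the planner revision):
./fast-downward.py domain.pddl problem.pddl --heuristic "hff=ff()" --search "lazy_greedy([hff], preferred=[hff])"
./fast-downward.py --alias lama-first domain.pddl problem.pddl
./fast-downward.py --alias seq-sat-lama-2011 domain.pddl problem.pddl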
msg6756 (view) Author: silvan Date: 2017-12-20.12:23:08
Moved summary provided by Michael to Change Note.
msg6755 (view) Author: silvan Date: 2017-12-20.12:22:51
The introduction of EvaluationContext in issue77 has caused a change in the
behavior of lazy search.

Before: reward_progress() would be invoked every time a new best heuristic
value was found, including for the initial state.
After: reward_progress() is not invoked for the initial state, and the behavior
is otherwise the same.

It is not clear which behavior is better, but the change does seem to be
unintended and is due to a separate handling of the progress check for the
initial state in the newer version.
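To make the difference concrete, here is a minimal self-contained sketch; ProgressTracker, on_state_evaluated, and the reward_progress stub are illustrative names, not the actual Fast Downward code:

#include <iostream>
#include <limits>

// Stand-in for boosting the preferred-operator open lists.
void reward_progress() {
    std::cout << "boosting preferred-operator open lists\n";
}

struct ProgressTracker {
    int best_h = std::numeric_limits<int>::max();

    // Returns true if h improves on the best heuristic value seen so far.
    bool check_progress(int h) {
        if (h < best_h) {
            best_h = h;
            return true;
        }
        return false;
    }
};

// Post-issue77 behavior: the reward is skipped for the initial state.
// The pre-issue77 behavior corresponds to dropping the is_initial_state check:
// the first evaluation trivially improves on "infinity", so the initial state
// was rewarded as well.
void on_state_evaluated(ProgressTracker &tracker, int h, bool is_initial_state) {
    if (tracker.check_progress(h) && !is_initial_state)
        reward_progress();
}

int main() {
    ProgressTracker tracker;
    on_state_evaluated(tracker, 10, true);   // initial state: progress, but no reward
    on_state_evaluated(tracker, 7, false);   // better h value: progress, rewarded
    on_state_evaluated(tracker, 9, false);   // no improvement: no reward
    return 0;
}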
History
Date                 User     Action  Args
2018-07-30 14:30:57  manuel   set     status: reviewing -> resolved; messages: + msg7337
2018-07-24 12:40:04  silvan   set     messages: + msg7314
2018-07-24 12:13:30  jendrik  set     messages: + msg7313
2018-07-24 12:11:27  malte    set     messages: + msg7312
2018-07-24 12:04:24  mkatz    set     messages: + msg7311
2018-07-24 11:47:39  manuel   set     messages: + msg7310
2018-07-12 14:17:42  silvan   set     messages: + msg7301
2018-06-07 19:21:32  manuel   set     messages: + msg7240
2018-06-07 12:49:34  malte    set     messages: + msg7235
2018-06-07 12:47:28  malte    set     messages: + msg7234
2018-06-07 10:09:23  silvan   set     status: in-progress -> reviewing; messages: + msg7228
2018-06-06 11:57:10  mkatz    set     messages: + msg7219
2018-06-06 10:53:24  manuel   set     messages: + msg7218
2018-06-06 10:52:54  silvan   set     messages: + msg7217
2018-06-06 10:44:43  silvan   set     messages: + msg7216
2018-06-06 09:26:16  manuel   set     messages: + msg7215
2018-06-04 12:22:42  manuel   set     messages: + msg7155
2018-06-04 12:15:20  silvan   set     messages: + msg7154
2018-06-04 12:13:52  manuel   set     messages: + msg7153
2018-06-04 12:09:06  silvan   set     messages: + msg7152
2018-06-04 12:06:35  manuel   set     messages: + msg7151
2018-06-04 10:38:34  manuel   set     status: chatting -> in-progress; nosy: + manuel; assignedto: manuel
2017-12-20 12:27:40  mkatz    set     messages: + msg6759
2017-12-20 12:27:32  malte    set     messages: + msg6758
2017-12-20 12:26:34  malte    set     messages: + msg6757
2017-12-20 12:23:39  jendrik  set     nosy: + jendrik
2017-12-20 12:23:08  silvan   set     messages: + msg6756
2017-12-20 12:22:51  silvan   set     status: unread -> chatting; nosy: + malte, mkatz, silvan; messages: + msg6755; summary: An introduction of EvaluationContext in issue77 has caused a change in the behavior of the lazy search. Before: reward_progress() would be invoked every time there was a new best heuristic value found, including the initial state. After: reward_progress() is not invoked for the initial state, and otherwise the same. It is not clear which behavior is better, but the change does seem to be unintended and is due to a separate handling of the progress check for the initial state in the newer version. ->
2017-12-20 12:21:57  mkatz    create