Issue756

Title: Unintended behavior change for reward progress in lazy search
Priority: feature
Status: resolved
Superseder:
Nosy List: jendrik, malte, manuel, mkatz, silvan
Assigned To: manuel
Keywords:
Optional summary:

Created on 2017-12-20.12:21:57 by mkatz, last changed by manuel.

Messages
msg7337 (view) Author: manuel Date: 2018-07-30.14:30:57
I am closing this issue because most of us agree with keeping the current behavior.
msg7314 (view) Author: silvan Date: 2018-07-24.12:40:04
I was interested in getting this issue off my radar :-)

I think the differences are too small to add a new option here, and thus also
vote for keeping the current behavior.
msg7313 (view) Author: jendrik Date: 2018-07-24.12:13:30
I also like Manuel's suggestion.
msg7312 (view) Author: malte Date: 2018-07-24.12:11:27
Every option has a cost in maintenance, documentation, understandability,
cognitive load and debuggability. I would rather not introduce a new one unless
there is a significant gain to be had from that. I like Manuel's suggestion.
What do the others think? (For example Silvan, you inquired about this recently,
so perhaps you're interested in the status?)
msg7311 (view) Author: mkatz Date: 2018-07-24.12:04:24
How about keeping both options, introducing yet another parameter to the search?
msg7310 (view) Author: manuel Date: 2018-07-24.11:47:39
Rewarding progress at the initial state does not show significant effects on
eager search.

For the experiment, I set the boost value to 1000, because the default boost
value is 0.
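For reference, such a boosted eager configuration could be specified roughly like this (a sketch only; the evaluator name is just an example and the exact option syntax depends on the planner revision):
./fast-downward.py domain.pddl problem.pddl --heuristic "hff=ff()" --search "eager_greedy([hff], preferred=[hff], boost=1000)"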

Configuration issue756-base is without reward at the initial state and
issue756-v2 is with a reward at the initial state.

http://ai.cs.unibas.ch/_tmp_files/heusner/issue756-v2-eager-issue756-base-issue756-v2-compare.html

I suggest keeping the current behavior of lazy search and eager search.
msg7301 (view) Author: silvan Date: 2018-07-12.14:17:42
Manuel, any news here?
msg7240 (view) Author: manuel Date: 2018-06-07.19:21:32
Eager search does not reward progress at the initial state in the current
version of Fast Downward. In a previous message, I suggested also testing the
behavior of eager greedy for this issue. Let me run the experiments and report
the results here.
msg7235 (view) Author: malte Date: 2018-06-07.12:49:34
And all other things being equal, I agree Manuel: not rewarding the initial
state makes slightly more sense to me than rewarding it. But I think it should
be the same in lazy and eager search, and I did not check what currently happens
in eager search.
msg7234 (view) Author: malte Date: 2018-06-07.12:47:28
I looked at the original papers, and the descriptions are not formal enough to
be 100% sure how this case should be handled. For me, either rewarding or not
rewarding the initial state would make some sense, but the behaviour should be
consistent between lazy and eager search.
msg7228 (view) Author: silvan Date: 2018-06-07.10:09:23
So, again, regarding the issue: we should decide whether to keep the current
behavior or whether it makes more sense to switch to something closer to the
formal definition. Since I don't know about the latter, others who do should
speak up :-)
msg7219 (view) Author: mkatz Date: 2018-06-06.11:57:10
It looks like the current variant performs slightly better almost everywhere, which is
not what I observed in my experiments, though additional factors could have played a
role there. It might be worth taking another look if/when we decide to integrate red-black
heuristics.
msg7218 (view) Author: manuel Date: 2018-06-06.10:53:24
Revision issue756-base is without the reward at the initial state and issue756-v1 is
with the reward.

You can find the pull request on Bitbucket:
https://bitbucket.org/manuel_h/downward/pull-requests/6/issue756/diff
msg7217 (view) Author: silvan Date: 2018-06-06.10:52:54
The differences do not seem to be very large. So I think the decision of which
variant to prefer boils down to which variant is closer to the definition of
progress and to what one would expect.
msg7216 (view) Author: silvan Date: 2018-06-06.10:44:43
Which version is which? Can you also please post a pull-request?
msg7215 (view) Author: manuel Date: 2018-06-06.09:26:16
The results of the experiment do not show significant differences across
configurations and domains.

In my opinion, we should not reward progress at the initial state, because all
definitions of progress that I have found in the literature in the context of
boosting preferred operators define progress relative to heuristic estimates of
already expanded states. If the generation of the initial state were considered
progress, it should have been defined explicitly.

Here are the comparison tables:
http://ai.cs.unibas.ch/_tmp_files/heusner/issue756-v1-issue756-base-issue756-v1-compare.html#summary
msg7155 (view) Author: manuel Date: 2018-06-04.12:22:42
I will test and report the behavior of lazy search and ignore the other
observations for this issue. We may discuss the other issues offline.
msg7154 (view) Author: silvan Date: 2018-06-04.12:15:20
Then please change the reporting order as you think is best.
msg7153 (view) Author: manuel Date: 2018-06-04.12:13:52
The reporting is consistent among search algorithms. I just feel that the flow
of reporting progress is broken by reporting the initial heuristic value and the
pruning method.
msg7152 (view) Author: silvan Date: 2018-06-04.12:09:06
I don't quite understand: is there a difference in the reporting between the
different searches that you want to make consistent? If so, then that is good.
msg7151 (view) Author: manuel Date: 2018-06-04.12:06:35
I noticed that progress is also not rewarded in eager_greedy at the initial
state. I would test the behavior of rewarding progress at the initial state for
eager_greedy as well.

Moreover, I noticed that the initial heuristic values as well as the progress
methods are reported after the first progress report in eager and lazy
searches. I would expect both to be reported beforehand. This is a small issue
that could be fixed within this issue.

What do you think?
msg6759 (view) Author: mkatz Date: 2017-12-20.12:27:40
Last revision before issue77: 874838d3625e
Revision after issue77: bcb6cef0e11a
msg6758 (view) Author: malte Date: 2017-12-20.12:27:32
...or perhaps it's worth also testing cg and cea (both as single heuristics in
lazy search, of course with preferred operators), as we expected a behaviour
change that can be dramatic, but only for certain heuristics. Or at least that's
what Michael saw in his experiments.
msg6757 (view) Author: malte Date: 2017-12-20.12:26:34
Someone want to set up an experiment for this? The code change to test this is
simple, only a few lines. Michael or I can give pointers to the necessary
changes if it helps.

This is only relevant for configurations involving lazy search and preferred
operators, so I suggest we test lazy search with the FF heuristic and preferred
operators, lama-first, and lama (separately).
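For reference, these configurations could be invoked roughly as follows (sketches only; the exact option syntax depends on the planner revision):
./fast-downward.py domain.pddl problem.pddl --heuristic "hff=ff()" --search "lazy_greedy([hff], preferred=[hff])"
./fast-downward.py --alias lama-first domain.pddl problem.pddl
./fast-downward.py --alias seq-sat-lama-2011 domain.pddl problem.pddl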
msg6756 (view) Author: silvan Date: 2017-12-20.12:23:08
Moved summary provided by Michael to Change Note.
msg6755 (view) Author: silvan Date: 2017-12-20.12:22:51
The introduction of EvaluationContext in issue77 has caused a change in the
behavior of lazy search.

Before: reward_progress() would be invoked every time a new best heuristic
value was found, including for the initial state.
After: reward_progress() is not invoked for the initial state, and the behavior
is otherwise the same.

It is not clear which behavior is better, but the change does seem to be
unintended and is due to a separate handling of the progress check for the
initial state in the newer version.
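To make the difference concrete, here is a minimal self-contained sketch; ProgressTracker, on_state_evaluated, and the reward_progress stub are illustrative names, not the actual Fast Downward code:

#include <iostream>
#include <limits>

// Stand-in for boosting the preferred-operator open lists.
void reward_progress() {
    std::cout << "boosting preferred-operator open lists\n";
}

struct ProgressTracker {
    int best_h = std::numeric_limits<int>::max();

    // Returns true if h improves on the best heuristic value seen so far.
    bool check_progress(int h) {
        if (h < best_h) {
            best_h = h;
            return true;
        }
        return false;
    }
};

// Post-issue77 behavior: the reward is skipped for the initial state.
// The pre-issue77 behavior corresponds to dropping the is_initial_state check:
// the first evaluation trivially improves on "infinity", so the initial state
// was rewarded as well.
void on_state_evaluated(ProgressTracker &tracker, int h, bool is_initial_state) {
    if (tracker.check_progress(h) && !is_initial_state)
        reward_progress();
}

int main() {
    ProgressTracker tracker;
    on_state_evaluated(tracker, 10, true);   // initial state: progress, but no reward
    on_state_evaluated(tracker, 7, false);   // better h value: progress, rewarded
    on_state_evaluated(tracker, 9, false);   // no improvement: no reward
    return 0;
}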
History
Date                 User     Action  Args
2018-07-30 14:30:57  manuel   set     status: reviewing -> resolved; messages: + msg7337
2018-07-24 12:40:04  silvan   set     messages: + msg7314
2018-07-24 12:13:30  jendrik  set     messages: + msg7313
2018-07-24 12:11:27  malte    set     messages: + msg7312
2018-07-24 12:04:24  mkatz    set     messages: + msg7311
2018-07-24 11:47:39  manuel   set     messages: + msg7310
2018-07-12 14:17:42  silvan   set     messages: + msg7301
2018-06-07 19:21:32  manuel   set     messages: + msg7240
2018-06-07 12:49:34  malte    set     messages: + msg7235
2018-06-07 12:47:28  malte    set     messages: + msg7234
2018-06-07 10:09:23  silvan   set     status: in-progress -> reviewing; messages: + msg7228
2018-06-06 11:57:10  mkatz    set     messages: + msg7219
2018-06-06 10:53:24  manuel   set     messages: + msg7218
2018-06-06 10:52:54  silvan   set     messages: + msg7217
2018-06-06 10:44:43  silvan   set     messages: + msg7216
2018-06-06 09:26:16  manuel   set     messages: + msg7215
2018-06-04 12:22:42  manuel   set     messages: + msg7155
2018-06-04 12:15:20  silvan   set     messages: + msg7154
2018-06-04 12:13:52  manuel   set     messages: + msg7153
2018-06-04 12:09:06  silvan   set     messages: + msg7152
2018-06-04 12:06:35  manuel   set     messages: + msg7151
2018-06-04 10:38:34  manuel   set     status: chatting -> in-progress; nosy: + manuel; assignedto: manuel
2017-12-20 12:27:40  mkatz    set     messages: + msg6759
2017-12-20 12:27:32  malte    set     messages: + msg6758
2017-12-20 12:26:34  malte    set     messages: + msg6757
2017-12-20 12:23:39  jendrik  set     nosy: + jendrik
2017-12-20 12:23:08  silvan   set     messages: + msg6756
2017-12-20 12:22:51  silvan   set     status: unread -> chatting; nosy: + malte, mkatz, silvan; messages: + msg6755; summary: An introduction of EvaluationContext in issue77 has caused a change in the behavior of the lazy search. Before: reward_progress() would be invoked every time there was a new best heuristic value found, including the initial state. After: reward_progress() is not invoked for the initial state, and otherwise the same. It is not clear which behavior is better, but the change does seem to be unintended and is due to a separate handling of the progress check for the initial state in the newer version. ->
2017-12-20 12:21:57  mkatz    create