Issue 1041: Relaxed task exploration of landmark factories wrongly reports solvable

Title	Relaxed task exploration of landmark factories wrongly reports solvable
Priority	bug	Status	resolved
Superseder		Nosy List	clemens, jendrik, malte, remo, salome, silvan
Assigned To	remo	Keywords
Optional summary

Created on 2022-02-03.11:16:06 by remo, last changed by clemens.

Messages
msg10554 (view)	Author: clemens	Date: 2022-02-09.10:52:13
We have merged the issue.
msg10544 (view)	Author: malte	Date: 2022-02-08.18:05:51
I had a look at the pull request and left two comments, but this is generally ready to merge. I would like some changes to the change log entry, but if you don't feel confident about this change but would still like to change this now, we can also polish it at a later stage. Editing the change log entries in context is a standard step of our release process anyway.
msg10543 (view)	Author: malte	Date: 2022-02-08.18:04:38
Looks good to me. I would add that another benefit of dropping the assertion is that Zhu & Givan no longer depends on landmarks::Exploration in its own code, which indeed it shouldn't. The base class still does, but this is another thing that should be changed in the long term, so this is also one more step towards untangling the class hierarchy for the landmark factories.
msg10538 (view)	Author: clemens	Date: 2022-02-08.15:47:44
I try to summarize the discussion I had with Malte, Salomé and Remo over Zoom today. Please correct me if I'm wrong or add if there's something missing. We have tried to understand what exactly is happening in the landmarks::Exploration class and the places where its only public function compute_reachability_with_excludes is called. We came to the conclusion that said function computes an approximation of which goals are relaxed-reachable when excluding a set of propositions and/or a set of operators. The result is exact in the absence of conditional effects. In the presence of conditional effects, it overapproximates whether a task is relaxed solvable without the excludes, i.e., it declares more tasks to be solvable than actually are. This was the case already before we opened this issue and the fix proposed by Remo and I, but it is still the case now, because it only disallows the unary operators that achieve the excluded propositions, but not other side-effects of the corresponding original operator. As discussed before, the proposed changes have no influence on the context where causal landmarks are computed, but it changes the behaviour when deciding whether a relaxed task is solvable. According to our analysis, the way the information is used is sound with this approach. It was probably sound already before, but the approximation is more exact now, so it's still a win. Nevertheless, the assertion that originally pointed us to this issue is still problematic, because we still over-approximate whether a task is solvable or not. If we could make sure that the Zhu and Givan landmark factory uses an even weaker approximation, it would still be fine, but we decided against taking that route and instead just remove the assertion. Given that we have put a lot of time into understanding the implementation and found a lot of opportunities to improve code quality, we decided to open two follow-up issues. All in all, the next steps are as follows: 1. Remove the assertion and revise the added documentation in this issue. 2. Open a new issue to remove dead code (i.e., compute_lvl_ops is always false, level_out is always true, exploration is only used for the assertion in Zhu & Givan). 3. Open a new issue to refactor the exploration (i.e., initialize lvl_var inside exploration, compute excluded_op_ids inside exploration).
msg10527 (view)	Author: salome	Date: 2022-02-07.22:23:14
My understanding of the approximation was the following: If an operator achieves a "forbidden" effect x conditionally, then we assume we can achieve all other effects without achieving x, and thus make UnaryOperators for all these other effects. If we still cannot achieve the goal with them, then there is no way to achieve it without also achieving x, i.e. the approximation is sound. But I'm far from sure that this is correct, so I'd be very happy to discuss it live.
msg10519 (view)	Author: malte	Date: 2022-02-07.18:54:44
There may be a conceptual problem with the whole intention of this (not the intention of the fix but the intention of the original code) that perhaps we need to discuss. Assume that we want to answer the question "Is this delete-relaxed task solvable without achieving landmark X?". Not sure if this is exactly the question we're answering here, but I think it's something along those lines. Without conditional effects and axioms, this can be checked in polynomial time for delete-free tasks. With conditional effect (and still without axioms), I think this is an NP-complete question. This was not known at the time this code was written (and is still an unpublished result), so I think the code does something there that is potentially unsound. Perhaps it is sound, and the approximations made in the new code are safe. But I'd be happier if we discussed this live first.
msg10508 (view)	Author: salome	Date: 2022-02-04.15:08:37
I also took another look and agree that it is also correct for (b). The code only behaves differently if excluded_op_ids is empty and excluded_prop isn't; but this isn't the case for (b) since excluded_props is empty. I also took a quick look at the comment and the changelog entry; looks ready to merge for me.
msg10507 (view)	Author: clemens	Date: 2022-02-04.14:54:59
Allright, thanks for taking the time to look into it again. So I take it you still agree this is ready to merge (after adding a changelog entry)? (Btw., I agree the implementation is ugly, Remo and I found a lot of opportunities to improve code quality when searching for the bug.) Does anybody else want to have a look and join the discussion before we merge?
msg10506 (view)	Author: silvan	Date: 2022-02-04.14:50:10
I didn't even look at the contexts where the function is used but just followed your description here, so no. Case (a) still makes sense to me and I verified the code and came to same conclusion. It took me way too long to understand case (b) in the code, but I think you are right. The code is really ugly; using the cost-members of the propositions for these special cases (-1 and -2) really makes it hard to understand what goes on. Anyway, this is not the story of this issue.
msg10503 (view)	Author: clemens	Date: 2022-02-04.13:46:45
To clarify: the first sentence of msg10502 should be: ... realized that the changed function is used in two different contexts: ...
msg10502 (view)	Author: clemens	Date: 2022-02-04.13:41:52
While working on the documentation, Remo and I realized that the function is changed in two different contexts: (a) the function relaxed_task_solvable where we spotted the issue, and (b) the function is_causal_landmarks. In (a), we check whether a relaxed task is still solvable when disallowing all operators that achieve some excluded proposition(s) (i.e., a potential landmark). In (b), we check whether a relaxed task is still solvable when disallowing all operators that have some landmark as precondition. We asked ourselves, whether it is problematic if we additionally exclude the unary operators achieving the excluded propositions in (b) as done after our change. Our conclusion was that it is not problematic. Let's say x is the proposition that is excluded. If it occurs as an operator precondition, then those operators are excluded due to excluded_op_ids and it doesn't matter if we never make their precondition true. If x is not a precondition of any operator, then the relaxed task should be solvable without applying the unary operators achieving x anyway. @Silvan: did you consider this as well? If yes, and we arrived at the same conclusion, we will add the changelog entry and prepare merging. If not, could you maybe have another look and confirm or deny our analysis?
msg10501 (view)	Author: silvan	Date: 2022-02-04.12:19:36
Please add a change log entry, as this is a bug fix.
msg10500 (view)	Author: silvan	Date: 2022-02-04.12:16:33
I agree with the analysis of the experiments and I think this can be merged.
msg10497 (view)	Author: clemens	Date: 2022-02-04.11:23:38
Exactly, I just took the list of relevant landmark configurations from issue987 and removed the ones unaffected by this change. The reports can be found here: https://ai.dmi.unibas.ch/_tmp_files/buechner/issue1041-v1-optimal-compare.html https://ai.dmi.unibas.ch/_tmp_files/buechner/issue1041-v1-satisficing-compare.html In the optimal case, there is virtually no difference, so nothing to discuss there in our (Remo and my) opinion. In the satisficing case, it is nice to see that the change increases the number of landmarks and landmark orderings found in the LAMA configurations. We assume this is the main reason for the overall positive effects on plan quality and numbers of expansions/evaluations. It is also the main reason why more memory is used. The psr-large domain is the main source of increased memory usage in the total sum, which makes sense since there are also the most additional landmarks. So overall this change has little negative effects and the positive ones compensate them. We will add more documentation, but think the experiments indicate that this can be merged soon.
msg10496 (view)	Author: salome	Date: 2022-02-03.21:27:10
Agreed, comments explaining the functionality would be very helpful in this part of the code! Regarding the experiments: you are correct, all generation methods except lm_hm use the Exploration class. In the summary of issue987 (the meta issue about cleaning up the landmark code) we collected interesting configurations to test, as far as I can tell Clemens included all of them except the one using only lm_hm.
msg10493 (view)	Author: silvan	Date: 2022-02-03.17:34:07
I think I understood the problem and the solution and the minimalistic pull-request definitely does exactly what you described and not more. However, I think it would be good to document what you found in the code, maybe in a comment explaining what the whole function does. The current documentation is very limited. Also, I saw that you added experiments, which I think is mandatory here since you changed functionality. I'm not too familiar with all the aliases; but did you make sure that your selection of configurations includes all heuristics and/or landmark generation methods that make use of the exploration class?
msg10490 (view)	Author: remo	Date: 2022-02-03.11:23:22
When checking the solvability of the relaxed task without achieving the excluded landmark, all operators achieving this landmark should be disallowed. The list excluded_op_ids captures those operators that achieve the landmark unconditionally. All unary operators derived from these operators are disallowed. Additionally, if an operator achieves the landmark conditionally only the unary operator containing the landmark effect should be disallowed. However, the code currently only does this when excluded_op_ids is not empty. If a landmark is only achieved with conditional effects, the corresponding unary operators are never disallowed. We fixed this by removing the check whether excluded_op_ids is empty: https://github.com/aibasel/downward/pull/88
msg10489 (view)	Author: remo	Date: 2022-02-03.11:16:06
While looking at issue998, Clemens and I came across an issue regarding how conditional effects are handled. In particular, running the following command ./fast-downward.py --debug ../benchmarks/miconic-fulladl/f10-2.pddl --search "astar(lmcount(lm_factory=lm_zg()))" results in the following Assertion `initial_state[lm.var].get_value() == lm.value \|\| !relaxed_task_solvable(task_proxy, exploration, true, landmark)' failed. We investigated this issue and came to the conclusion that the problem originates in the relaxed_task_solvable function, as it can wrongfully determine the relaxed task to be solvable without achieving the landmark when conditional effects are present. This is because it is currently allowed to apply conditional operators that achieve excluded propositions. The same function is also used in landmark_factory_rpg_sasp, landmark_factory_rpg_exhaust, and landmark_factory_relaxation, where it is used to calculate the level at which variables and operators can be achieved. The induced levels could be incorrect. According to the wiki, RHW landmarks do support conditional effects and for ZG landmarks it says that "we think that they are supported, but this is not 100% sure". We think that currently conditional effects are not handled properly in either.

History
Date	User	Action	Args
2022-02-09 10:52:13	clemens	set	status: testing -> resolved messages: + msg10554
2022-02-08 18:05:51	malte	set	messages: + msg10544
2022-02-08 18:04:38	malte	set	messages: + msg10543
2022-02-08 15:47:44	clemens	set	messages: + msg10538
2022-02-07 22:23:14	salome	set	messages: + msg10527
2022-02-07 18:54:44	malte	set	messages: + msg10519
2022-02-04 15:08:37	salome	set	messages: + msg10508
2022-02-04 14:54:59	clemens	set	messages: + msg10507
2022-02-04 14:50:10	silvan	set	messages: + msg10506
2022-02-04 13:46:45	clemens	set	messages: + msg10503
2022-02-04 13:41:52	clemens	set	messages: + msg10502
2022-02-04 12:19:36	silvan	set	messages: + msg10501
2022-02-04 12:16:33	silvan	set	messages: + msg10500
2022-02-04 11:23:38	clemens	set	messages: + msg10497
2022-02-03 21:27:10	salome	set	messages: + msg10496
2022-02-03 17:34:07	silvan	set	status: in-progress -> testing assignedto: remo messages: + msg10493 nosy: + silvan
2022-02-03 11:23:22	remo	set	messages: + msg10490
2022-02-03 11:16:06	remo	create

Issue1041