Issue1072

Title: Landmark Count ignores conjunctive landmarks for preferred operators
Priority: bug
Status: in-progress
Superseder:
Nosy List: clemens, jendrik, malte, salome, silvan, simon
Assigned To:
Keywords:
Optional summary: Part of issue987.

Created on 2022-12-21.13:51:07 by clemens, last changed by salome.

Summary
Part of issue987.
Messages
msg11534 Author: salome Date: 2024-01-22.17:09:06
In general, the results look good to me. For lama it seems that the overhead is on average around 2-3% of search time, which is of course not ideal but also not too tragic in my opinion. What I find interesting is that in miconic-fulladl the new version is consistently slightly faster (around 10%) while having the same number of expansions.
For the conjunctive configuration, we lose 3 tasks (in caldera-split, snake, and visitall-sat11); most notably, in snake we lose a task which the base version solved in only 0.9 seconds. I assume that we are just really unlucky and send the search in the "wrong" direction right at the beginning, but it might be worth taking a look at the task to see what is going on.

One optimization Clemens and I discussed is that we could add a flag to the landmark graph indicating whether or not the graph contains conjunctive landmarks at all. We could then check this flag before we check if there is a conjunctive landmark containing a specific fact. As Clemens pointed out, that check might be expensive, and my hope is that a general "contains_conjunctive_landmarks" flag for the graph (which is only set once in the beginning) would work much better for branch prediction. However, it makes the code slightly uglier in my eyes, so I'm not sure if it's worth doing that just to (potentially) regain some of the very small performance loss. @Malte do you have an opinion/hunch here?
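
Roughly, I imagine something like the following sketch; all names (in particular contains_conjunctive_landmarks and atom_in_conjunctive_landmark) are made up for illustration and this is not the actual Fast Downward code:

    #include <utility>
    #include <vector>

    struct FactPair {
        int var;
        int value;
    };

    struct Landmark {
        std::vector<FactPair> atoms;
        bool conjunctive;
    };

    class LandmarkGraph {
        std::vector<Landmark> landmarks;
        // Set once while the graph is built; afterwards the branch below is
        // perfectly predictable for configurations without conjunctive landmarks.
        bool contains_conjunctive_landmarks = false;
    public:
        void add_landmark(Landmark landmark) {
            if (landmark.conjunctive)
                contains_conjunctive_landmarks = true;
            landmarks.push_back(std::move(landmark));
        }

        bool atom_in_conjunctive_landmark(const FactPair &fact) const {
            // Cheap early exit, e.g. for lm_rhw(), which never produces
            // conjunctive landmarks.
            if (!contains_conjunctive_landmarks)
                return false;
            for (const Landmark &landmark : landmarks) {
                if (!landmark.conjunctive)
                    continue;
                for (const FactPair &atom : landmark.atoms) {
                    if (atom.var == fact.var && atom.value == fact.value)
                        return true;
                }
            }
            return false;
        }
    };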

The percentage estimates above are guessed from the following scatterplot comparing lama search time base vs v5:
https://ai.dmi.unibas.ch/_visualizer/?xattribute=search_time&yattribute=search_time&properties_file=https%3A%2F%2Fai.dmi.unibas.ch%2F_experiments%2Fai%2Fdownward%2Fissue1072%2Fdata%2Fissue1072-v5-single-eval%2Fproperties.tar.xz&entries_list=issue1072-base-lama-first+issue1072-v5-lama-first+lama&y_range=%5B0.7106740898757299%2C+11.0%5D&x_range=%5B0.009000000000000001%2C+11000.0%5D&relative=True&autoscale=True&xsize=1000&ysize=1000&groupby=domain
msg11526 Author: clemens Date: 2024-01-18.11:22:46
The suggested change actually brought the number of expansions back down to the ballpark of not preferring operators based on conjunctive landmarks. Now, there are several domains where considering them actually reduces the number of expansions quite a bit (e.g., Caldera-split, Nomystery, and Sokoban). In general, the change doesn't seem to affect the configuration much, though. Time scores become a bit worse, expansion/evaluation/generation scores slightly better. See https://ai.dmi.unibas.ch/_experiments/ai/downward/issue1072/data/issue1072-v5-single-eval/issue1072-v5-single-issue1072-base-issue1072-v5-compare.html for the results of a single-search configuration (I didn't run anytime search on v5 yet).

Note that the last change also affects which operators we consider preferred for disjunctive and simple landmarks. The results indicate that it doesn't affect lama-first.

I don't see anything too worrying in the experiment report. Since the change closes a gap in the definition of preferred operators for landmarks, I think it would be a good extension of the code. There is also no clear benefit, though, so I'm not sure how to proceed. I could run the anytime experiment to have some more data if people would be interested in that.
msg11525 Author: clemens Date: 2024-01-17.11:48:12
I ran the experiments; here are the results:
https://ai.dmi.unibas.ch/_experiments/ai/downward/issue1072/data/issue1072-v4-single-eval/issue1072-v4-single-issue1072-base-issue1072-v4-compare.html
https://ai.dmi.unibas.ch/_experiments/ai/downward/issue1072/data/issue1072-v4-anytime-eval/issue1072-v4-anytime-issue1072-base-issue1072-v4-compare.html

LAMA, which is mainly there to make sure nothing breaks in the non-conjunctive case, doesn't seem to be affected much. It appears to be a bit slower in general, which kinda makes sense: when computing preferred operators, there is now an additional check, for every effect of every applicable operator, whether there are conjunctive landmarks containing it. Even though there are never conjunctive landmarks in LAMA (which uses lm_rhw()), this check might not be handled well by branch prediction or compiler optimizations, and therefore performing it over and over again adds up. I didn't verify this hunch, though, and it might as well be something else.

Regarding the configuration that actually uses conjunctive landmarks: doing so seems to have mixed effects depending on the domain. Schedule stands out with a significantly worse number of expansions/evaluations. I will look into that and in particular check whether it might have to do with what I mentioned in my last message: considering an operator preferred when it achieves a part of the conjunction that already holds might be a bad idea.
msg11522 Author: clemens Date: 2024-01-16.11:15:02
The landmark code has changed a lot since we started working on this issue. Since we got rid of the mk_acyclic_graph function, I'm curious how this looks today. Because of the many changes, it wouldn't have been trivial to merge the current main branch into the existing pull request, so I decided to copy the changes from the old one into a new, up-to-date branch. You can find the new pull request here: https://github.com/aibasel/downward/pull/210

It is not obvious to me from the archived experiments what data we're interested in. Probably most interesting is the effect of considering preferred operators for satisficing heuristics with conjunctive landmarks, e.g., something like lama-first but with lm_hm(m=2) instead of lm_rhw(). Besides that, we need to make sure that the changes don't affect recommended configurations like LAMA negatively. I will set up experiments for these to see results of the current version in an updated codebase.

One thing I thought about when migrating the changes was the following: Operators are preferred if they achieve any atom of a conjunctive future landmark, even if that atom already holds in the state where this is checked. However, in those cases they are not really preferable because the landmark can be achieved without applying them. It might be worth checking whether the operator actually adds a part of the conjunctive landmark that does not yet hold and only preferring those operators. The same check could also be done for disjunctive and simple landmarks: if an atom already holds, it makes no sense to prefer an operator that adds it, even if the corresponding landmark is future, because we would first need to go to a state where the atom does not hold before adding it again to actually achieve the future landmark; only in a state where the atom doesn't hold should operators that add it be preferred.
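
To make the idea concrete, here is a rough sketch of the stricter test; all types and names are made up for illustration and this is not the actual lmcount code:

    #include <cstddef>
    #include <functional>
    #include <unordered_set>
    #include <vector>

    struct FactPair {
        int var;
        int value;
        bool operator==(const FactPair &other) const {
            return var == other.var && value == other.value;
        }
    };

    struct FactPairHash {
        std::size_t operator()(const FactPair &fact) const {
            return std::hash<int>()(fact.var) * 31u + std::hash<int>()(fact.value);
        }
    };

    using State = std::unordered_set<FactPair, FactPairHash>;

    struct Landmark {
        // The atoms of the landmark (all of them for a conjunctive landmark,
        // any one of them for a disjunctive landmark, a single one otherwise).
        std::vector<FactPair> atoms;
    };

    // Prefer an operator only if one of its add effects is an atom of a future
    // landmark that does *not* already hold in the current state: adding an
    // atom that already holds cannot bring the landmark any closer.
    bool is_preferred(const std::vector<FactPair> &add_effects,
                      const std::vector<Landmark> &future_landmarks,
                      const State &state) {
        for (const FactPair &effect : add_effects) {
            if (state.count(effect))
                continue;  // the atom already holds, so skip this effect
            for (const Landmark &landmark : future_landmarks) {
                for (const FactPair &atom : landmark.atoms) {
                    if (atom == effect)
                        return true;
                }
            }
        }
        return false;
    }

With this reading of the atoms, the same function would cover conjunctive, disjunctive, and simple landmarks alike.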
msg11031 Author: malte Date: 2023-02-21.16:45:04
I'm not so worried about what happens in mk_acyclic_graph.

The whole idea of making the landmark graph acyclic is bad in the first place, and we should work on getting rid of this soon. In the meantime, some changes of behaviour within that function are not a concern to me.

(This is just a comment on this specific aspect; I did not look at experimental results more generally. Let me know if I should, or more generally if you're waiting for me to do something/comment on something.)
msg11017 Author: clemens Date: 2023-02-14.15:21:19
To give some more context: The function *mk_acyclic_graph* (I'm very unhappy with that name, by the way, but also with the idea of the function itself, of course) explores the landmark graph starting from every node one after the other (skipping those nodes for which it previously found that they cannot be part of any cycle). It marks encountered nodes as visited and proceeds until a previously visited node is encountered. If this happens, it removes the first-weakest ordering on the path from the visited node back to itself. Hence, this choice depends on the order in which the landmark nodes are considered/expanded, and this might change between the different versions due to the use of an unordered set as the data structure. The reason why a different number of orderings is removed is that cycles can overlap in the sets of orderings they consist of, and hence it is possible to get rid of multiple cycles at once by removing a single ordering. Since the implemented algorithm chooses greedily and simply removes the first-weakest ordering, the number of removed orderings does not have to be the same overall for two different versions.
msg11016 Author: simon Date: 2023-02-14.15:02:56
In the new line of experiments (v3), Clemens and I looked into the difference in orderings. For domains like trucks we waved it off as part of the translator nondeterminism.
However, blocksworld, for example, is not affected by that but still has different numbers of orderings. We found that the step which makes the landmark graph acyclic uses an unordered set. We checked it locally for probBLOCKS-10-0.pddl to confirm that the set contents are the same. We assume that the different iteration order over the set removes the landmark orderings differently, resulting in a different landmark graph with more or fewer edges.
msg11011 Author: simon Date: 2023-02-10.17:30:11
The current version has some form of hierarchy: if an operator achieving a simple landmark can be preferred (due to simple landmarks being interesting), only operators achieving simple landmarks are preferred. Operators achieving disjunctive landmarks are only preferred if no such operator for a simple landmark was found.
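
To illustrate, the hierarchy roughly works as in the following sketch (this is not the actual implementation; operator IDs are plain ints and the two predicates are placeholders):

    #include <functional>
    #include <vector>

    std::vector<int> compute_preferred(
            const std::vector<int> &applicable_ops,
            const std::function<bool(int)> &achieves_simple_landmark,
            const std::function<bool(int)> &achieves_disjunctive_landmark) {
        std::vector<int> simple_preferred;
        std::vector<int> disjunctive_preferred;
        for (int op : applicable_ops) {
            if (achieves_simple_landmark(op))
                simple_preferred.push_back(op);
            else if (achieves_disjunctive_landmark(op))
                disjunctive_preferred.push_back(op);
        }
        // The hierarchy: simple landmarks take precedence, disjunctive ones are
        // only used as a fallback, and conjunctive landmarks do not contribute
        // at all (which is the gap this issue is about).
        if (!simple_preferred.empty())
            return simple_preferred;
        return disjunctive_preferred;
    }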

In the PR https://github.com/aibasel/downward/pull/149, conjunctive landmarks can now become interesting if the corresponding option is set. The hierarchy can also be controlled.

We did some experiments: https://ai.dmi.unibas.ch/_experiments/ai/downward/issue1072/data/
It seems that using the hierarchy does not provide any benefit that is significant above the noise.

As we did not see any reason to have this hierarchy in the first place, the goal is to remove it. Additionally, conjunctive landmarks should always be used if the landmark factory produces them.

A further cleanup of the code is needed, as well as a follow-up experiment.
msg10899 Author: clemens Date: 2022-12-21.13:51:07
The lmcount heuristic ignores conjunctive landmarks when computing preferred operators. In particular, *lgraph->get_node(fact)* returns a null pointer when there is neither a simple landmark for *fact* nor a disjunctive landmark including *fact*. We should discuss how to derive preferred operators for conjunctive landmarks as well and not just let these cases silently fall through.
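
Roughly, the fall-through looks like the following simplified model (apart from *get_node*, all names are made up and this is not the real code):

    #include <cstddef>
    #include <functional>
    #include <unordered_map>
    #include <vector>

    struct FactPair {
        int var;
        int value;
        bool operator==(const FactPair &other) const {
            return var == other.var && value == other.value;
        }
    };

    struct FactPairHash {
        std::size_t operator()(const FactPair &fact) const {
            return std::hash<int>()(fact.var) * 31u + std::hash<int>()(fact.value);
        }
    };

    struct LandmarkNode {
        bool is_future;
    };

    struct LandmarkGraph {
        // Only atoms of simple and disjunctive landmarks are indexed here;
        // atoms that appear solely in conjunctive landmarks have no entry.
        std::unordered_map<FactPair, LandmarkNode, FactPairHash> nodes_by_atom;

        const LandmarkNode *get_node(const FactPair &fact) const {
            auto it = nodes_by_atom.find(fact);
            return it == nodes_by_atom.end() ? nullptr : &it->second;
        }
    };

    bool achieves_future_landmark(const LandmarkGraph &graph,
                                  const std::vector<FactPair> &add_effects) {
        for (const FactPair &effect : add_effects) {
            const LandmarkNode *node = graph.get_node(effect);
            if (!node)
                continue;  // conjunctive-only atoms end up here and are ignored
            if (node->is_future)
                return true;
        }
        return false;
    }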

(This came up when working on issue1070.)
History
Date                 User     Action  Args
2024-01-22 17:09:06  salome   set     messages: + msg11534
2024-01-18 11:22:46  clemens  set     messages: + msg11526
2024-01-17 11:48:12  clemens  set     messages: + msg11525
2024-01-16 11:15:02  clemens  set     status: chatting -> in-progress
                                      messages: + msg11522
2023-02-21 16:45:04  malte    set     messages: + msg11031
2023-02-14 15:21:19  clemens  set     messages: + msg11017
2023-02-14 15:02:56  simon    set     messages: + msg11016
2023-02-10 17:30:11  simon    set     nosy: + simon
                                      messages: + msg11011
2022-12-21 13:51:07  clemens  create