Issue1087

Title Failing Assertion in Landmark Graph with h^m Factory
Priority bug Status resolved
Superseder Nosy List clemens, jendrik, malte, salome, silvan
Assigned To Keywords
Optional summary

Created on 2023-06-06.12:39:50 by clemens, last changed by clemens.

Messages
msg11422 (view) Author: clemens Date: 2023-10-04.18:05:18
Thank you both, this is merged now.
msg11420 (view) Author: malte Date: 2023-10-04.16:26:03
Looks good to me too. (Linking to our discussion in the Fast Downward meeting, I think I can't approve the pull request because it has already been approved.)
msg11418 (view) Author: salome Date: 2023-10-04.15:50:51
Looks good to me and I agree that we don't need experiments.
msg11417 (view) Author: clemens Date: 2023-10-04.15:32:57
I've implemented the change and created a pull request: https://github.com/aibasel/downward/pull/185
I have verified that the assertion does not fire anymore on the task from the original issue report message. I don't think experiments are necessary, since the release build is not affected. (We talked about running experiments in debug mode in the past, but that's probably an issue of it's own.)

Salomé, could you have a look at the pull request and approve it if you agree with the change?
msg11416 (view) Author: salome Date: 2023-10-03.16:17:57
Yes, sorry, this was exactly what I meant :).
msg11415 (view) Author: malte Date: 2023-10-03.16:05:53
Makes sense to me -- you know the code better than me.
msg11412 (view) Author: clemens Date: 2023-10-03.13:30:59
Ah, good catch! Yes, this is the assertion in question. I see the argument that the existence of the hashmaps within landmark graphs require uniqueness and therefore the assertion makes sense when adding a new landmark.

Then what you suggest essentially amounts to my previous suggestion in msg11088: Assert that either the added landmark is conjunctive or it does not overlap with others (when it is simple or disjunctive). Or did I misunderstand?
msg11411 (view) Author: salome Date: 2023-10-03.13:12:30
I presume the assertion is question is that one? (lines 105-107 of landmark_graph.cc)

    assert(all_of(landmark.facts.begin(), landmark.facts.end(), [&](const FactPair &lm_fact) {
                      return !contains_landmark(lm_fact);
                  }));

What is a hidden by this code is that contains_landmark() actually only checks if there is a simple or disjunctive landmark containing this fact. So having several conjunctive landmarks overlap is fine, but it will trigger if for example we already have a simple landmark "a" and try to add a conjunctive landmark "a ^ b".

The reason why it only checks for simple and disjunctive is that for these two types the landmark graph has a hashmap each that (uniquely) maps facts to a simple/disjunctive landmark, so I'd argue landmark_graph in a sense depends itself on the fact that simple/disjunctive landmarks don't overlap. As far as I can tell these hashmaps are only used by rhw, merged and zg (for zg only the simple variant), and of course should vanish once we allow overlapping in general, but I think for the moment we need to enforce no overlapping between simple/disjunctive landmarks for any factory since the graph itself assumes it.

I think an alternative fix for this issue could be that we just don't do the assertion when adding conjunctive landmarks, because the code doesn't depend on conjunctive landmarks not overlapping with anything.
msg11410 (view) Author: clemens Date: 2023-10-03.12:33:36
I'm not sure what my suggestion is either. ;-) I opened the issue because of the failing assertion with the priority "bug", so I definitely started off with the idea to just fix the assertion. While writing the issue message, I kind of derailed the conversation to this bigger picture idea of removing the assumption that landmarks do not need to overlap in general. However, there is issue257 that is the better place to have this discussion in my opinion.

To make a concrete suggestions: I think we should just fix the assertion for now. I don't know when we're going to be able to work on the other issue and we should avoid having failing assertions in released code. 

Regarding how to fix the assertion, I suggest the following: Instead of having the assertion in the code that adds the landmark, we could instead check the property in the landmark factories that rely on it, which is just lm_rhw I think. I can check whether other factories rely on it when we decide this is what we want for this issue.
msg11408 (view) Author: malte Date: 2023-10-03.11:35:55
The restriction to non-overlapping landmarks shouldn't be there. There is no good reason for it. I think it has its origins in what we now call the lm_rhw landmark generator, which needed ways to limit which disjunctive landmarks to consider because there are too many possibilities to consider all. One of the decisions taken to avoid explosion was "don't consider overlapping landmarks".

But this should really be a design decision of this landmark factory, not a property of the landmark code overall. But historically this was the only landmark generation method we had, with landmark generation, representation, heuristics etc. all baked into one, and this was not properly generalized when we started teasing these aspects apart.

Unfortunately, I think the assumption is baked into the interface in a few places in ways that make changing this nontrivial. (I think there are methods like "give me *the* landmark including this atom" etc.) There is no *conceptual* difficulty with getting rid of these assumptions, though. It would be great if someone started looking into this.

Is this the scope of the issue, or is the scope just to fix the assertion? It's not so clear to me what you're suggesting.
msg11088 (view) Author: clemens Date: 2023-06-06.12:39:50
Some parts of the landmark code rely on landmarks not to overlap. The function where new landmarks are added to the landmark graph asserts that no other landmark contains any of the facts from the landmark to be added. This assertion sometimes fails for the h^m landmark factory with m > 1. One such example is psr-small/p11-s18-n2-l2-f50.pddl. 

I'm not quite sure what this entails in the release build and whether this might become problematic somewhere else. I also don't know how we should address this issue. Maybe in the case of conjunctive landmarks we can allow overlaps in which case we should adapt the assertion to something like 

assert(conjunctive || X)

where X is what what is currently asserted. However, I generally dislike that we enforce landmarks to not overlap during construction. Doing so disregards potentially valuable information. If an algorithm (e.g. the landmark sum heuristic) would prefer landmarks that do not overlap, then this should rather be a post-processin in my opinion. Of course this adds precomputation time for these algorithms and maybe sometimes this is not preferrable, but in the bigger picture I would prefer this over the current implementation.
History
Date User Action Args
2023-10-04 18:05:18clemenssetstatus: in-progress -> resolved
messages: + msg11422
2023-10-04 16:26:03maltesetmessages: + msg11420
2023-10-04 15:50:51salomesetmessages: + msg11418
2023-10-04 15:32:57clemenssetstatus: chatting -> in-progress
messages: + msg11417
2023-10-03 16:17:57salomesetmessages: + msg11416
2023-10-03 16:05:53maltesetmessages: + msg11415
2023-10-03 13:30:59clemenssetmessages: + msg11412
2023-10-03 13:12:30salomesetmessages: + msg11411
2023-10-03 12:33:36clemenssetmessages: + msg11410
2023-10-03 11:35:55maltesetmessages: + msg11408
2023-06-06 12:39:50clemenscreate