Issue1128

Title: Enable goal flaws in CEGAR PDBs.
Priority: wish
Status: chatting
Superseder:
Nosy List: clemens, gabi, jendrik, malte, silvan
Assigned To:
Keywords:
Optional summary:

Created on 2023-11-15.12:37:05 by clemens, last changed by clemens.

Messages
msg11499 (view) Author: clemens Date: 2023-12-19.17:34:27
Thanks for the clarifications! So a possible solution would be to just add a boolean option to the CEGAR pattern generator that allows choosing whether to consider all goals or just a single goal in the CEGAR loop. Would that be a reasonable approach? If so, what should be the default setting? I'm in favor of the more aesthetic option of considering all goals.
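
To make this concrete, here is a minimal sketch in Python pseudocode of what such a switch could look like (the option and function names are made up for illustration, not Fast Downward's actual API):

    # Hypothetical illustration of the proposed boolean option; names are
    # invented for this sketch. It selects which goal facts the CEGAR loop
    # checks abstract plans against, and hence whether goal flaws can occur.
    def goal_for_flaw_detection(task_goal, seed_variable, consider_all_goals):
        """task_goal is a list of (variable, value) goal facts."""
        if consider_all_goals:
            return task_goal  # all goal facts: goal flaws become possible
        # status quo: keep only the goal fact of the seed variable
        return [(var, val) for (var, val) in task_goal if var == seed_variable]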
msg11498 (view) Author: malte Date: 2023-12-19.17:22:27
That message was in the context of domain abstractions and Cartesian abstractions (as it says), but some of the thoughts and considerations apply to all types of abstractions. You can ignore it if you like; it was meant primarily as a pointer that we already thought about this to some extent on the theory side and can dock onto these old thoughts if there is interest to think about this in depth. In that case, it's perhaps most efficient to discuss this live.

The main point is that in the precondition-free case, certain aspects of the algorithms trivialize with "static tie-breaking-based" flaw selection/refinement strategies, which I think our standard strategies are. The question then is what the consequences of this observation are. In the PDB setting, these aspects don't apply in quite the same way.


The observation you made for the PDB CEGAR strategy isn't really limited to precondition-free operators. More generally, the idea of only caring about a single goal is a very aggressive form of pruning that makes these algorithms incomplete. 

This single-goal focus is not motivated by iPDB. iPDB's inability to jump to goals that are not directly connected makes it fail to converge to an optimum even with unbounded resources. This weakness was one of the motivations for the CEGAR approach, which does not have these difficulties when instantiated in a "vanilla" way.

I think the single-goal focus was purely experimentally motivated: it just happened to work better than the other alternatives that were considered. From an aesthetic perspective, I was never a fan of these kinds of approaches. Here we see that their theoretical limitations do indeed turn out to be problematic in some cases. It would be ideal to have an algorithm that is both aesthetically pleasing and works well on average. But the ideal cannot always be reached, in which case a good option is to offer multiple strategies to choose from.
msg11497 (view) Author: clemens Date: 2023-12-19.16:52:46
Hi Malte, I'm not sure I understand what you want to say in your last message. I understand that there is a connection to this issue in the fact that Rubik's Cube has precondition-free operators. This is indeed a characteristic that led me to raise this issue, but I don't get the connection to the part about variable orderings.

Let me summarize again what I want to address. (I also changed the title to better reflect that.)
CEGAR for PDBs can never find goal flaws in the current implementation. The reason is that the pattern is initialized with a goal variable and this is the only variable considered relevant for reaching the goal in the CEGAR loop. Moreover, this means the only possible source of flaws is precondition flaws. With this, if there are no interactions with other variables (as is the case for, e.g., Rubik's Cube), patterns will never become larger than that single goal variable.

I suggest considering all variables relevant for reaching the goal in the original problem in the CEGAR loop. This enables goal flaws to occur (their handling is implemented but currently never triggered), which then adds more goal variables to the patterns even if there are no interactions between different variables. This leads to mixed results, as described in msg11484. I have a hard time assessing these results on my own. If somebody else has an opinion, I would be very interested.

I created a pull request for the very small change I suggest in the code: https://github.com/aibasel/downward/pull/200
msg11485 (view) Author: malte Date: 2023-11-15.14:16:51
I think this may have come up before, but I don't think we followed up on it at the time. I found a note on the permutation puzzle Discord channel, but it is not very helpful (January 19, 2023, if you want to look it up). Here is a relevant bit:

"When I thought about this a bit more when I got offline, there were a few more specific properties of Rubik's Cube and how they interact with the abstraction refinement that I would like to discuss at some point, such as the fact that the operators are precondition-free.
I think the tie-breaking of the domain abstractions can be quite important. Do you know how this is done? If multiple goals are violated and the splitting strategy likes them equally well, is there consistent tie-breaking according to some variable order?"

I think I thought about this first for Cartesian CEGAR in the "full" version where we consider the entire goal, not just a single goal or landmark subproblem. But it should apply equally well to CEGAR domain abstraction (and the behaviour for CEGAR PDBs is clear in any case).

I think the reason why I brought up tie-breaking is the following: if there is a consistent tie-breaking strategy when selecting flaws, such as preferring the lowest variable ID, then we have an order v_1, ..., v_n on the state variables, and I think Cartesian CEGAR on precondition-free tasks will work as follows: first, refine only the first variable until the initial heuristic value h(I) gets to the projection heuristic value h*_{v_1}(I). Then, refine v_1 and v_2 until h(I) gets to the projection heuristic value h*_{v_1, v_2}(I). And so on.
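
To illustrate what I mean by consistent tie-breaking, here is a tiny sketch (assuming flaws are identified by their variable IDs; purely illustrative, not the actual implementation):

    # Static tie-breaking: among equally preferred flaws, always pick the
    # variable with the lowest ID. This induces the fixed refinement order
    # v_1, v_2, ..., v_n described above.
    def select_flaw(flawed_variable_ids):
        return min(flawed_variable_ids)

    # Example: with flaws on v_3 and v_1, refinement always attacks v_1 first.
    assert select_flaw([3, 1]) == 1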
msg11484 (view) Author: clemens Date: 2023-11-15.14:14:29
Alright, so we implemented the simple change. What I expected to happen was that the patterns might become a bit larger because more flaw candidates might need fixing. This might lead to better search performance at the expense of memory. We tested the CEGAR collection generator in combination with the cpdbs and zopdbs heuristics.

Here are the results:
https://ai.dmi.unibas.ch/_experiments/ai/downward/issue1128/data/cegar-pdb-goals-v1-eval/cegar-pdb-goals-v1-cegar-pdb-goals-base-cegar-pdb-goals-v1-compare.html

The impact was larger than I expected. The scores for evaluated, expanded, and generated states go up significantly, by 30-100 points, and total time is also significantly reduced (the total-time score goes up by ~50 for cpdbs and ~60 for zopdbs). This on its own sounds pretty nice to me! However, coverage with the cpdbs heuristic goes down by 9 while it goes up by 42 for zopdbs, so a mixed result (although the gain for zopdbs also seems incredible to me).

So for me the first impression persists: it seems worthwhile to consider this change. This is also why I created an issue for it. As it is, though, I don't have a good feeling about the reduced coverage. Since the algorithm depends on randomization, Silvan suggested repeating the experiment and looking at the average numbers. We did that, but the results look pretty similar (https://ai.dmi.unibas.ch/_experiments/ai/downward/issue1128/data/cegar-pdb-goals-v1-average-eval/cegar-pdb-goals-v1-average-cegar-pdb-goals-base-cegar-pdb-goals-v1-compare.html).

Besides the heuristics being more informed, we can also see in the experiments the shift to increased memory requirements. In particular, the difference in coverage for cpdbs is explained by 56 more out-of-memory errors and 47 fewer out-of-time errors. We didn't test this explicitly, but we assume it is because larger patterns are generated after the change. In general, we see that exhausting the memory limit is the main reason for terminating the planner.

So then we thought about how we could counteract this trend, and one idea was to change the size limits for abstractions and abstraction collections. The pattern collection generator has parameters restricting the number of states for each individual abstraction and the number of states in the collection. By default, these are set to 1 million and 10 million, respectively. So I set up an experiment with smaller limits and this was the outcome: https://ai.dmi.unibas.ch/_experiments/ai/downward/issue1128/data/cegar-pdb-goals-v1-different-sizes-eval/cegar-pdb-goals-v1-different-sizes-cegar-pdb-goals-base-cegar-pdb-goals-v1-compare.html

It shows that for zopdbs the defaults seem reasonable; both before and after the change we achieve the highest coverage with these limits. For cpdbs, however, 100k/10M performs better even before the change. With these limits, the change then has a positive effect on coverage. Hence, with cpdbs and the CEGAR pattern collection generator we achieve the highest coverage by changing the size limits *and* making the suggested change to consider goal flaws in the pattern generation, while for zopdbs the highest coverage is achieved by keeping the default size limits and making the suggested change regarding goal flaws. For what it's worth: I recall that when we wrote the paper on domain abstractions, we also observed for PDBs with saturated cost partitioning as the heuristic that it was beneficial to reduce the size limit of individual abstractions to 100k while the collection could have 10M states.
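
To make the size-limit mechanics concrete, here is a rough sketch of how such limits constrain pattern growth (the parameter names are descriptive placeholders, not necessarily the exact option names; the defaults mirror the 1M/10M values mentioned above):

    import math

    # A variable may only be added to a pattern if the resulting PDB (the
    # product of the domain sizes of its variables) stays within the limit
    # for a single abstraction, and the collection as a whole stays within
    # the collection limit.
    def can_add_variable(pattern, var, domain_sizes, collection_states,
                         max_pdb_size=10**6, max_collection_size=10**7):
        new_pdb_size = math.prod(domain_sizes[v] for v in pattern) * domain_sizes[var]
        return (new_pdb_size <= max_pdb_size
                and collection_states + new_pdb_size <= max_collection_size)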

Now the big question is what to make of all of this. I personally think it makes a lot of sense to actually consider goal flaws in the CEGAR implementation for PDBs. It is useful on the theoretical side and has very positive effects in practice, except for overall coverage. Regarding size limits, I don't have a strong opinion. If we can live with reduced coverage given the other positive effects, then we don't have to do anything in my opinion. However, since we're already at it and have seen in the last experiment that smaller size limits actually lead to better performance for cpdbs (which I consider the most state-of-the-art PDB heuristic integrated in Fast Downward), we could also adapt these defaults.

Did I miss something important or misinterpret anything? Do you have any comments or thoughts you'd like to add to the discussion?
msg11483 (view) Author: jendrik Date: 2023-11-15.13:09:10
Interesting! A similar thing is done for the *Cartesian* goal abstractions (build one abstraction per goal fact). If the goal facts are easy to reach individually, building these abstractions is very quick, so we are currently working on methods that compute Cartesian goal abstractions considering multiple goal facts.
msg11482 (view) Author: clemens Date: 2023-11-15.12:37:05
I have observed a property of the CEGAR PDB implementation that I would like to discuss. I talked about it with Silvan and we have run some preliminary experiments, but now I would like to open it up for further opinions. For this post, I assume you are familiar with the basics of the CEGAR algorithm.

So the thing I noticed is that the pattern collection generator based on CEGAR considers exactly one variable to be relevant for the goal while building a pattern. From what I understand, this implementation choice is inspired by the way patterns are generated using hill climbing: start with a collection containing one atomic projection for each goal variable and continue from there. The CEGAR approach also starts with an atomic projection of a single goal variable and adds more variables to the pattern according to the flaws found. (Repeat with different goal variables to get a collection.) Flaws in the PDB case are either (1) an abstract plan is not applicable because of an unsatisfied precondition, or (2) the abstract plan does not lead to a goal. However, recall that, as I mentioned, in the CEGAR algorithm the goal is projected onto the single goal variable the pattern is seeded with, so flaws of type (2) will never occur.

Now this is not necessarily problematic, and we have seen that it works well in practice anyway. However, I stumbled over this with a problem of the following form: all variables are goal variables and operators have no preconditions. Consequently, flaws of type (1) will also never occur. With this kind of problem, the patterns we get are just atomic projections, which are not necessarily informative.

There is a way to end up with larger patterns in this example and it is easy enough: instead of projecting the goal in the CEGAR algorithm onto the single variable, just consider the original goal. At this point I'll submit this message to obtain an issue number, then I can share the results with you.
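
To illustrate, here is a small runnable toy sketch (the task representation is made up for this post; this is not the actual implementation) showing how the projected goal suppresses type (2) flaws and how the original goal enables them:

    # Toy flaw detection for CEGAR PDBs. Operators are (precondition,
    # effect) pairs of {variable: value} dicts.
    def find_flaws(plan, pattern, state, goal):
        state = dict(state)
        flaws = set()
        for pre, eff in plan:
            for var, val in pre.items():
                if var not in pattern and state[var] != val:
                    flaws.add(var)      # type (1): unsatisfied precondition
            if flaws:
                return flaws
            state.update(eff)
        for var, val in goal.items():
            if var not in pattern and state[var] != val:
                flaws.add(var)          # type (2): plan misses a goal fact
        return flaws

    # Two goal variables, a precondition-free operator (think Rubik's Cube),
    # and an abstract plan for the seed pattern {0}.
    initial_state = {0: 0, 1: 0}
    plan = [({}, {0: 1})]
    original_goal = {0: 1, 1: 1}
    projected_goal = {0: 1}  # status quo: only the seed variable's goal fact

    print(find_flaws(plan, {0}, initial_state, projected_goal))  # set(): no flaws, pattern stays {0}
    print(find_flaws(plan, {0}, initial_state, original_goal))   # {1}: goal flaw grows the pattern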
History
Date                 User     Action  Args
2023-12-19 17:34:27  clemens  set     messages: + msg11499
2023-12-19 17:22:27  malte    set     messages: + msg11498
2023-12-19 16:52:46  clemens  set     messages: + msg11497; title: Extend set of candidates for goal flaws in CEGAR PDBs. -> Enable goal flaws in CEGAR PDBs.
2023-11-15 14:16:52  malte    set     messages: + msg11485
2023-11-15 14:14:29  clemens  set     messages: + msg11484
2023-11-15 13:09:10  jendrik  set     messages: + msg11483
2023-11-15 12:38:16  gabi     set     nosy: + gabi
2023-11-15 12:37:05  clemens  create