Issue 1128: Enable goal flaws in CEGAR PDBs. - Fast Downward issue tracker

Title	Enable goal flaws in CEGAR PDBs.
Priority	wish	Status	resolved
Superseder		Nosy List	clemens, florian, gabi, jendrik, malte, silvan
Assigned To		Keywords
Optional summary

Created on 2023-11-15.12:37:05 by clemens, last changed by malte.

Messages
msg11637 (view)	Author: malte	Date: 2024-07-16.12:55:59
Since the last comment, we had several live discussions on this issue and the associated pull request that brought some relevant new information. The following is adapted from a summary that Florian gave in a Discord discussion: =========================================================================== We had another meeting to discuss this issue today. To summarize the main points: - We compared the pseudo-code to the implementation again and found that the implementation matches the paper. The paper defines two variants. The first (disjoint_cegar) considers the original task both for initializing the pattern collection and for finding goal flaws. The second (multiple_cegar) works on tasks that are limited to a single goal (also in both places). In the paper this is done by passing a modified task to the CEGAR algorithm. (In addition there is an implementation that generates a single pattern called cegar_pattern but the issue is mostly about creating collections, so we ignore this here.) - With disjoint_cegar the new option should have no effect since this configuration already considers all goals. - Adapting multiple_cegar to use the full goal when checking for flaws would make it more similar to disjoint_cegar. Instead of an option, we could treat this as a third variant and its own pattern collection generator. It would remix the ideas of the other two generators and sit somewhere in between them (make multiple runs to generate single patterns like multiple_cegar but consider all goals for flaws like disjoint_cegar). - If we want this as a feature, we should compare it to both of the other generators and implement it as its own generator. It could still access the functionality from the common class CEGAR but we could avoid the new option in the existing generators. If we do this, this would also be the point to clean up the constructor of the CEGAR class and think about a way to cleanly support all three generators that use it. ================================================================================= We currently don't have immediate plans to pursue this, so following our general policy to avoid keeping feature suggestions open unless we have a tangible commitment to seeing them through, we are closing this issue. But if someone is interested in seeing this feature implemented, they are very welcome to reopen the issue or create a new one referencing this one and going ahead with it. Some important aspects in that case: - As discussed above, this might fit better as a third CEGAR-based PDB pattern collection generator rather than a new parameter setting of an existing generator. - An experimental evaluation should include a comparison to both existing pattern collection generators. - In a perfect world, we should also evaluate this with saturated cost partitioning, as the paper does and which is clearly the go-to method according to the results presented there. This could be done either by using Scorpion as the basis of the implementation or by adding saturated cost partitioning to Fast Downward. But evaluating on saturated cost partitioning should not be a blocker for adding a new collection generator.
msg11617 (view)	Author: florian	Date: 2024-07-09.16:27:12
I did a review of the code and think the change itself is quite straight-forward. I'm hesitating a bit, though, because the command line option is passed through many classes up to the point where it is needed (adding complexity). On the lowest level there is a class called CEGAR, which is already initialized with a task and a separate list of facts that are asserted to be goals of the task, so we are passing in a task and a subset of its goal. The patch adds a boolean option to ignore this passed-in subset and use the original goal of teh task in some but not all places (for example, in the initialization, the passed in subset is used). I think all of this is a bit clunky. It would be clearer if we could just pass in a task and use its goal. The surrounding code could then change the goal of the task before passing it in if necessary. Or, if the subset is just used to generate an initial pattern collection (looks like it), we could pass in an initial pattern collection rather than passing in a subset of goals and use its variables for this initial collection. With this approach, the code would be simpler if we didn't have to support an option for the old behavior but instead only support the behavior that Clemens suggested.
msg11616 (view)	Author: clemens	Date: 2024-07-09.12:09:08
I have changed the implementation so that the user can now choose which set of goals should be considered by a boolean command line option use_restricted_goal which is set to false by default. (I have discussed this default offline with Florian and Simon.) So I think the last thing to do is a code review. I'll look for a volunteer as part of the ongoing sprint.
msg11499 (view)	Author: clemens	Date: 2023-12-19.17:34:27
Thanks for the clarifications! So a possible solution would be to just add a boolean option to the CEGAR pattern generator that allows to choose whether one would prefer to consider all goals or just a single goals in the CEGAR loop. Would that be a reasonable approach? If so, what should be the default setting? I'm in favor of the more aesthetic option to consider all goals.
msg11498 (view)	Author: malte	Date: 2023-12-19.17:22:27
That message was in the context of domain abstractions and Cartesian abstractions (as it says), but some of the thoughts and considerations apply to all types of abstractions. You can ignore it if you like; it was meant primarily as a pointer that we already thought about this to some extent on the theory side and can dock onto these old thoughts if there is interest to think about this in depth. In that case, it's perhaps most efficient to discuss this live. The main point is that in the precondition-free case, certain aspects of the algorithms trivialize with "static tie-breaking-based" flaw selection/refinement strategies, which I think our standard strategies are. The question then is what the consequences of this observation are. In the PDB setting, these aspects don't really apply equally. The observation you made for the PDB CEGAR strategy isn't really limited to precondition-free operators. More generally, the idea of only caring about a single goal is a very aggressive form of pruning that makes these algorithms incomplete. This single-goal focus is not motivated by iPDB. The inability of iPDB to jump to not directly connected goals makes iPDB fail to converge to an optimum even with unbounded resources. This weaknesses was one of the motivations for the CEGAR approach, which does not have these difficulties if instantiated in a "vanilla" way. I think the single-goal focus was purely experimentally motivated: it just happened to work better than other alternatives that were considered. From an aesthetical perspective, I was never a fan of these kinds of approaches. Here we see that their theoretical limitations do indeed turn out to be problematic in some cases. It would be ideal to have an algorithm that is both aesthetical and works well on average. But the ideal cannot always be reached, in which case a good option is to offer multiple strategies to choose from.
msg11497 (view)	Author: clemens	Date: 2023-12-19.16:52:46
Hi Malte, I'm not sure I understand what you want to say in your last message. I understand that there is a connection to this issue regarding the fact that Rubik's Cube has precondition-free operators. This is indeed a specification that lead to me raising this issue, but I don't get the connection to the part about variable orderings. Let me summarize again what I want to address. (I also changed the title to better reflect that.) CEGAR for PDBs can never find goal flaws in the current implementation. The reason is that the pattern is initialized with a goal variable and this is the only variable considered relevant for reaching the goal in the CEGAR loop. Moreover, this means the only source of flaws are precondition flaws. With this, if there are no interactions with other variables (as is the case for e.g. Rubik's Cube), patterns will never become larger than that single goal variable. I suggest to consider all variables relevant to reach the goal in the original problem in the CEGAR loop. This enables goal flaws to occur (which is implemented but just never happens) which then adds more goal variables to the patterns even if there are no interactions between different variables. This leads to mixed results as described in msg11484. I have a hard time assessing these results on my own. If somebody else has an optinion, I would be very interested. If created a pull request for the very small change I suggest in the code: https://github.com/aibasel/downward/pull/200
msg11485 (view)	Author: malte	Date: 2023-11-15.14:16:51
I think this may have come up before, but I don't think we followed it up at the time. I found a note on the permutation puzzle Discord channel, but it is not very helpful (January 19, 2023, if you want to look it up). Here is a relevant bit: "When I thought about this a bit more when I got offline, there were a few more specific properties of Rubik's Cube and how they interact with the abstraction refinement that I would like to discuss at some point, such as the fact that the operators are precondition-free. I think the tie-breaking of the domain abstractions can be quite important. Do you know how this is done? If multiple goals are violated and the splitting strategy likes them equally well, is there consistent tie-breaking according to some variable order?" I think I thought about this first for Cartesian CEGAR in the "full" version where we consider the entire goal, not just a single goal or landmark subproblem. But it should apply equally well to CEGAR domain abstraction (and the behaviour for CEGAR PDBs is clear in any case). I think the reason why I brought up tie-breaking is the following: if there is a consistent tie-breaking strategy when selecting flaws, such as preferring the lowest variable ID, then we have an order v_1, ..., v_n on the state variables, and I think Cartesian CEGAR on precondition-free tasks will work as follows: first, refine only the first variable until the initial heuristic value h(I) gets to the projection heuristic value h1{v_1}(I). Then, refine v_1 and v_2 until h(I) gets to the projection heuristic value h1{v_1, v_2}(I). And so on.
msg11484 (view)	Author: clemens	Date: 2023-11-15.14:14:29
Alright, so we implemented the simple change. What I expected to happen was that maybe the patterns become a bit larger because more flaw candidates might need fixing. This might lead to better search performance at the expense of memory. We tested the CEGAR collection generator in combination with the cpdbs and zopdbs heuristics. Here are the results: https://ai.dmi.unibas.ch/_experiments/ai/downward/issue1128/data/cegar-pdb-goals-v1-eval/cegar-pdb-goals-v1-cegar-pdb-goals-base-cegar-pdb-goals-v1-compare.html The impact was larger than I expected. The scores for evaluated, expanded, and generated states go up significantly, by 30-100 points and total time is also significantly reduced (total time score goes up by ~50 for cpdbs and ~60 for zopdbs). This on its own sounds pretty nice to me! However, the coverage using the cpdbs heuristic goes down by 9 while it goes up for zopdbs by 42, so a mixed result (although the gain for zopdbs is also incredible to me). So for me the first impression persists: It seems to be worthwhile to consider this change. This is also why I created an issue for it. As it is, though, I don't have a good feeling about the reduced coverage. Since the algorithm depends on randomization, Silvan suggested to repeat the experiment and look at the average numbers. We did that but it looks pretty similar (https://ai.dmi.unibas.ch/_experiments/ai/downward/issue1128/data/cegar-pdb-goals-v1-average-eval/cegar-pdb-goals-v1-average-cegar-pdb-goals-base-cegar-pdb-goals-v1-compare.html). Besides the heuristics being more informed, we can see in the experiments also the shift to increased memory requirements. In particular, the differenc in coverage for cpdbs is explained by 56 more out-of-memory errors and 47 fewer out-of-time errors. We didn't test this explicitly but we assume this is because of larger patterns being generated after the change. In general we see that exhausting memory limits is the main reason for terminating the planner. So then we thought about how we could counteract this trend and one idea was to change the size limit for abstractions and abstraction collectsions. There pattern collection generator has parameters restricting the number of states for each individual abstraction and the number of states in the collection. By default, these are set to 1 million and 10 million, respectively. So I set up an experiment with smaller limits and this was the outcome: https://ai.dmi.unibas.ch/_experiments/ai/downward/issue1128/data/cegar-pdb-goals-v1-different-sizes-eval/cegar-pdb-goals-v1-different-sizes-cegar-pdb-goals-base-cegar-pdb-goals-v1-compare.html It shows that for zopdbs the default seems reasonable; both before and after the change we achieve the highest coverage with these limits. For cpdbs, however, 100k/10M performs better even before the change. With these limits, the change then has a positive effect on coverage. Hence, we achieve the highest coverage with cpdbs and the CEGAR pattern collection generator by changing the size limits and doing the suggested change to consider goal flaws in the pattern generation, while for zopdbs highest coverage is achieved by not changing the size limits but doing the suggested change regarding goal flaws. For what it's worth: I recall that when we wrote the paper on domain abstractions we also observed that changing the size limits for PDBs with saturated cost partitioning as the heuristic it was also beneficial to reduce the size limits of individual abstractions to 100k while the collection could have 10M states. Now the big question is what to make of all of this. I personally think it makes a lot of sense to actually consider goal flaws in the CEGAR implementation for PDBs. It's useful both on the theoretical side and has very positive effects in practice, except for overall coverage. Regarding size limits I don't have a strong opinion. If we can live with reduced coverage given the other positive effects, then we don't have to do anything in my opinion. However, since we're already at it and have seen in the last experiment that smaller size limits actually lead to better performance for cpdbs (which I consider the most state of the art PDB heuristic integrated in Fast Downward), we could also adapt these defaults. Did I miss something important or misinterpret anything? Do you have any comments or thoughts you'd like to add to the discussion?
msg11483 (view)	Author: jendrik	Date: 2023-11-15.13:09:10
Interesting! A similar thing is done for the Cartesian goal abstractions (build one abstraction per goal fact). If the goal facts are easy to reach individually, building these abstractions is very quick, so we are currently working on methods that compute Cartesian goal abstractions considering multiple goal facts.
msg11482 (view)	Author: clemens	Date: 2023-11-15.12:37:05
I have observed a property of the CEGAR PDB implementation that I would like to discuss. I talked about it to Silvan and we have made some preliminary experiments, but now I would like to open it up for further opinions. For this post, I assume you are familiar with the basics of the CEGAR algorithm. So the thing I noticed is, that the pattern collection generator based on CEGAR considers exactly one variable to be relevant for the goal while building the pattern. From what I understand, this implementation choice is inspired by the way patterns are generated using hillclimbing: Start with a collection containing one atomic projections to each goal variable and continue from there. The CEGAR approach also starts with an atomic projection of a single goal variable and adds more variables to the patterns according to the flaws found. (Repeat with different goal variables to get a collection.) Flaws in the PDB case are either (1) an abstract plan is not applicable because of an unsatisfied precondition or (2) the abstract plan does not lead to a goal. However, recall that I mentioned in the CEGAR algorithm the goal is projected onto the single goal variable it is seeded with, so flaws of type (2) will never occur. Now this is not necessarily problematic and we have seen that it works well in practice anyway. However, I came to stumble over this with a problem of the following form: All variables are goal variables and operators have no preconditions. Consequently, flaws of type (1) will also never occur. With this kind of problem, the patterns we get are just atomic projections which are not necessarily informative. There is a way to end up with larger patterns in this example and it is easy enough: Instead of projecting the goal in the CEGAR algorithm to the single variable, just consider the original goal. At this point I'll submit this message to obtain an issue number, then I can share the results with you.

History
Date	User	Action	Args
2024-07-16 12:56:09	malte	set	status: in-progress -> resolved
2024-07-16 12:55:59	malte	set	messages: + msg11637
2024-07-09 16:27:12	florian	set	nosy: + florian messages: + msg11617
2024-07-09 12:09:08	clemens	set	status: chatting -> in-progress messages: + msg11616
2023-12-19 17:34:27	clemens	set	messages: + msg11499
2023-12-19 17:22:27	malte	set	messages: + msg11498
2023-12-19 16:52:46	clemens	set	messages: + msg11497 title: Extend set of candidates for goal flaws in CEGAR PDBs. -> Enable goal flaws in CEGAR PDBs.
2023-11-15 14:16:52	malte	set	messages: + msg11485
2023-11-15 14:14:29	clemens	set	messages: + msg11484
2023-11-15 13:09:10	jendrik	set	messages: + msg11483
2023-11-15 12:38:16	gabi	set	nosy: + gabi
2023-11-15 12:37:05	clemens	create

Issue1128