Created on 2014-08-16.03:28:28 by patrik, last changed by salome.
See meta-issue: issue924.
See msg8204 of issue862 for an overview of how the axiom code currently works.
(May become out of date while we're working on the axiom code further.)
Check for references to "issue453" in the code, specifically axiom_rules.py.
TODOs:
1. Have a translator option that implements the new strategy with the trivial overapproximation. Make sure that relaxed reachability and relevance analysis play nicely with this. Make sure that the axiom evaluator in the search code ignores the fake rules. (In general, to make this work also with the old behaviour, we can ignore all rules where the head is the default value of the variable.)
2. Later, we probably want to try more sophisticated things, e.g. don't overapproximate if there are no cycles.
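The rule-filtering idea in TODO 1 can be sketched as follows. This is a minimal illustration only; the `Rule` type and `default_value` mapping are hypothetical, not the translator's actual data structures:

```python
# Minimal sketch of the idea in TODO 1: when evaluating axioms, ignore
# every rule whose head assigns a derived variable its default value.
# Under negation-by-failure semantics each evaluation starts from the
# default values anyway, so such rules are no-ops.
# Rule and default_value are illustrative, not translator code.
from collections import namedtuple

Rule = namedtuple("Rule", ["head_var", "head_value", "body"])

def relevant_rules(rules, default_value):
    """Drop rules whose head just re-derives the variable's default value."""
    return [rule for rule in rules
            if rule.head_value != default_value[rule.head_var]]
```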

msg9667
Author: salome
Date: 2020-07-22.13:40:14

I updated issue924 with the new references and created a new issue (issue978) for removing axiom layers from the translator output.
Given that this issue was merged and the discussion results have been documented in the relevant issues, I am marking this issue as resolved.

msg9658
Author: malte
Date: 2020-07-20.12:50:55

> It would be good to create an issue -- if we don't have one yet -- for this
> potential follow-up work (removing negated rules from the translator, removing axiom
> layers from the output, removing default values for axioms from the output, making
> all derived variables binary).
After searching a bit, it looks like issue454 covers most of it, and perhaps the things in this list not yet covered by it could be referenced from it.
I think it would be useful to have a meta-issue on axioms in general. I have a saved search with keyword "translator", and it shows a good number of issues related to axioms that I think it would make sense to have an overview of in one place: issue924 (already a meta-issue), issue450, issue454, issue54, issue161, perhaps parts of issue371.
Salomé, can you do one or both of these?
> I updated issue454 according to this, but I'm not so sure anymore what you
> meant with removing axiom layers from the output. Do you rather want to
> compute them in the search component?
Yes and no. I think having layers associated as an intrinsic attribute of a derived variable isn't the best design in the first place. The layers are not so much a property of the variables as they are a property of the algorithms that use them. Within the search component, the axiom evaluator might want to use one layering strategy while a given heuristic might want to use another one. So I think it would make more sense for the layers to be computed by the users that know whether they would prefer as few layers as possible or layers with as few variables as possible. A derived variable would not "have" a layer. Some algorithms would instead compute and store layers for the variables/rules.
This may lead to some redundant computations if there are multiple users that want to use the same layering strategy, but that's not a good enough reason to have the layers as part of the search component's input. Historically, we've gone the same way with other things, removing causal graphs, domain transition graphs and successor generators that all used to be part of the search component's input.

msg9650
Author: salome
Date: 2020-07-17.17:53:54

I quote your comment from the pull request here:
It would be good to create an issue -- if we don't have one yet -- for this potential
follow-up work (removing negated rules from the translator, removing axiom layers
from the output, removing default values for axioms from the output, making all
derived variables binary).
I updated issue454 according to this, but I'm not so sure anymore what you meant
with removing axiom layers from the output. Do you rather want to compute them in
the search component?

msg9647
Author: malte
Date: 2020-07-17.17:20:00

Merged and pushed.
If I remember correctly, I left some TODOs regarding followups in the last pull request discussions. Salomé or Augusto, can you have a look and put them somewhere relevant in the tracker, so that they don't get lost? Perhaps in the meta-issue for this if it fits? I don't recall the details, and I'm not sure how to find them again, but I hope you got notifications about them. If not, I can try to dig them up next week.
Once this is done, this one can be marked as resolved and the meta-issue can be updated.

msg9645
Author: malte
Date: 2020-07-17.11:04:03

You can do it now or I can do it later today. (I don't mind doing it, it's a quick change.)

msg9644
Author: salome
Date: 2020-07-16.22:34:22

Yay! :) You also mentioned in the pull request that we should stay consistent with string formatting styles, but at that point I already changed it to Jendrik's suggestion. I can either change it myself tomorrow or you do it before merging, whichever way you prefer.

msg9643
Author: malte
Date: 2020-07-16.21:40:55

No problem, I was just confused by the way the merge conflict message was phrased, but it was easy to solve by just creating a new clone of the repository.
I had a closer look at which axioms exactly are pruned in this openstacks task, and the pruning makes sense.
We detect an invariant that every object is in at most one of three states: waiting, started, or shipped. There is a late stage in the processing that detects that it's *exactly one* of these states rather than *at most one* of these states, and the axioms being pruned are the ones that (impossibly) require that a given object is in none of these three states.
This is related to the way we get rid of negative conditions on non-binary finite-domain variables in axioms. Say an axiom has the condition "(not (waiting o3))". We first transform this into three axioms with conditions "(started o3)", "(shipped o3)", and "(none-of-the-three-states o3)". And then later that last one is discarded because "(none-of-the-three-states o3)" is detected as unreachable.
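The transformation described above can be sketched as follows. This is an illustration under the stated "exactly one" invariant, with illustrative atom names; it is not translator code:

```python
# Sketch of the transformation described above: a negative condition on
# one member of an "exactly one of these values" invariant group is
# replaced by one positive alternative per other member of the group
# (including the artificial none-of-those value). Each alternative
# yields its own copy of the axiom; copies whose condition is later
# detected unreachable (e.g. "none-of-the-three-states") are discarded.
def expand_negative_condition(negated_atom, invariant_group):
    """Return the positive alternatives that replace 'not negated_atom'."""
    return [atom for atom in invariant_group if atom != negated_atom]
```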
Unless I missed something, we have no other open things here, so if nobody has any objections, I'll merge this. Hopefully tomorrow, otherwise ASAP.

msg9642
Author: salome
Date: 2020-07-16.19:09:57

Oops, sorry! I changed the commit messages of two commits to prepend [issue453] and wanted to mention it here but completely forgot about it >.<. I hope it doesn't cause too many issues; it's the two commits after "Reverse clusters for layer computation." (commit f9309e7312655626e73960da3eae0e781828eb21).

msg9641
Author: malte
Date: 2020-07-16.18:50:39

Hi Salomé, did you force-push something into your repository? I get merge conflicts when I pull, but they seem to be between two different code versions by you.

msg9636
Author: salome
Date: 2020-07-16.12:40:09

I made the suggested changes to the translator output, removed the custom parser and added the missing experiment files. I also updated the pull request by marking most conversations as resolved, but please feel free to unmark them if you disagree.

msg9634
Author: salome
Date: 2020-07-16.11:08:47

I looked into the newly removed axioms, using openstacks/p01 as an example. The axioms that are removed in v5 all set a derived variable to "true". In base, these derived variables all have default value "true", which means the axioms setting them to true were already removed in axiom_rules (axioms removed this way never occurred in any statistics).
Should I take a closer look into other problems or are we satisfied enough with this explanation?

msg9631
Author: malte
Date: 2020-07-15.20:23:58

I looked at the code on github again and left some comments. The comments involve some followup we might want to do, but not within this issue.
If I'm not missing something, this means looking at translator_axioms_removed (msg9618) is the last open task for this issue.

msg9620
Author: malte
Date: 2020-07-15.15:04:17

I did local runs on Airport-ADL #23 for base and the current configuration with min and max layers. I didn't set a time or memory limit, so there were no timeouts or running out of memory, but you can see what is going on based on the outputs. Here are excerpts for f layers 55 to 65.
base:
[t=2010.51s, 2399108 KB] f = 55, 9064634 evaluated, 8343622 expanded
[t=2179.58s, 2590012 KB] f = 56, 9830328 evaluated, 9064634 expanded
[t=2363.62s, 2790336 KB] f = 57, 10642835 evaluated, 9830328 expanded
[t=2558.67s, 3002896 KB] f = 58, 11504629 evaluated, 10642835 expanded
[t=2764.61s, 3228332 KB] f = 59, 12418576 evaluated, 11504629 expanded
[t=2985.98s, 3467440 KB] f = 60, 13387946 evaluated, 12418576 expanded
[t=3219.97s, 3852044 KB] f = 61, 14416425 evaluated, 13387946 expanded
[t=3466.89s, 4121296 KB] f = 62, 15508142 evaluated, 14416425 expanded
[t=3731.36s, 4407240 KB] f = 63, 16667683 evaluated, 15508142 expanded
[t=4013.08s, 4711416 KB] f = 64, 17900057 evaluated, 16667683 expanded
[t=4310.38s, 5038760 KB] f = 65, 19210636 evaluated, 17900057 expanded
new, min layers:
[t=1194.57s, 2399752 KB] f = 55, 9064634 evaluated, 8343622 expanded
[t=1297.3s, 2590536 KB] f = 56, 9830328 evaluated, 9064634 expanded
[t=1407.56s, 2790956 KB] f = 57, 10642835 evaluated, 9830328 expanded
[t=1523.73s, 3003456 KB] f = 58, 11504629 evaluated, 10642835 expanded
[t=1645.24s, 3228920 KB] f = 59, 12418576 evaluated, 11504629 expanded
[t=1775.55s, 3467996 KB] f = 60, 13387946 evaluated, 12418576 expanded
[t=1913.72s, 3852648 KB] f = 61, 14416425 evaluated, 13387946 expanded
[t=2058.45s, 4121900 KB] f = 62, 15508142 evaluated, 14416425 expanded
[t=2215.43s, 4407860 KB] f = 63, 16667683 evaluated, 15508142 expanded
[t=2382.49s, 4712072 KB] f = 64, 17900057 evaluated, 16667683 expanded
[t=2560.58s, 5039244 KB] f = 65, 19210636 evaluated, 17900057 expanded
new, max layers:
[t=1939.65s, 2399696 KB] f = 55, 9064634 evaluated, 8343622 expanded
[t=2106.56s, 2590564 KB] f = 56, 9830328 evaluated, 9064634 expanded
[t=2284.78s, 2790996 KB] f = 57, 10642835 evaluated, 9830328 expanded
[t=2472.18s, 3003516 KB] f = 58, 11504629 evaluated, 10642835 expanded
[t=2675s, 3228972 KB] f = 59, 12418576 evaluated, 11504629 expanded
[t=2885.79s, 3468028 KB] f = 60, 13387946 evaluated, 12418576 expanded
[t=3111.25s, 3852680 KB] f = 61, 14416425 evaluated, 13387946 expanded
[t=3350.35s, 4121924 KB] f = 62, 15508142 evaluated, 14416425 expanded
[t=3605.76s, 4407900 KB] f = 63, 16667683 evaluated, 15508142 expanded
[t=3874.18s, 4712068 KB] f = 64, 17900057 evaluated, 16667683 expanded
[t=4160.49s, 5039296 KB] f = 65, 19210636 evaluated, 17900057 expanded
I think we can see that memory usage is not significantly affected, and the min layer code is significantly faster than either baseline or the max layer code.

msg9618
Author: malte
Date: 2020-07-15.13:36:32

Test is running.
@Augusto: Salomé updated the HTML report to include the "translator_simplified_axioms" attribute. If you're interested, you may have to Shift-Reload.
Salomé and I also looked at the experiment reports together. We had a question mark about where the changes to "translator_axioms_removed" come from in the domains where they changed from 0 to something larger than 0. I'll be in meetings soon, but perhaps I can look at this in the evening. Apart from this point, it looks good to me, and it looks like we solved the problems with coverage and runtime. (There is some occasional slowdown, but I would say nothing to be worried about considering that the old code was wrong.)

msg9617
Author: malte
Date: 2020-07-15.13:19:03

If you compare the runtime on this domain (https://ai.dmi.unibas.ch/_tmp_files/simon/issue453-v5.html#total_time-airport-adl), the new code (without the max layers option) is a lot faster than the old one. Not sure why, perhaps it's the change to the axiom evaluator that skips over axioms setting the default value. If the code is faster, then it will run out of memory more frequently rather than reaching the timeout.
Blind search running out of time is actually a pretty bad sign, it means we don't manage to expand states quickly. It would be interesting to see what the bottlenecks are in the tasks where it does run out of time (e.g. in trucks and miconic-fulladl), but of course this is not what we need to look at for this issue.
I'll do a test with Airport-ADL #23 to see if the output is consistent with the hypothesis that faster planner speed is responsible for these out-of-memory conditions.

msg9616
Author: augusto
Date: 2020-07-15.12:16:38

Cool! The only question I have (related to what you mentioned in msg9577) is why blind-v5 (with the min-layer strategy) has far more out-of-memory errors than the base version and the max-layers version in airport-adl. Almost half of the tasks (24 of 50) run out of memory with min-layer, while only 5 do with the base version. However, in all tasks where blind-v5 ran out of memory, the other strategies also failed (with OOM or timeout).

msg9614
Author: salome
Date: 2020-07-15.10:26:42

I addressed the comments of the last code review and reran the experiments:
https://ai.dmi.unibas.ch/_tmp_files/simon/issue453-v5.html
Looks good to me. Coverage even goes up a tad for ff-lazy and lama-first, time scores are sometimes better (maybe because axioms.cc now ignores axioms that set a derived variable to its default value), and I did not see any more deviations in expansions_until_last_jump.

msg9605
Author: malte
Date: 2020-07-13.18:36:03

I did a code review.

msg9602
Author: salome
Date: 2020-07-13.13:51:45

I found the reason why the expansions until the last jump differed: we accidentally introduced a bug where the result of simplify() is not applied to the axioms, so dominated axioms with derived variables that are not used anywhere else survive and weaken the relevance analysis.
I fixed this bug, as well as a bug we uncovered during code review (simplify iterated over a list that was changed during the iteration), and addressed the pull request comments. I also tested one problem picked at random from each experiment domain with the DEBUG flag set to True and did not find any issues.
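The iteration bug mentioned above is an instance of a classic Python pitfall. A minimal illustration (not the actual translator code):

```python
# Classic pitfall behind the simplify() bug mentioned above: removing
# elements from a list while iterating over it silently skips entries.
# This is an illustration, not the actual translator code.
def drop_dominated_buggy(rules, is_dominated):
    for rule in rules:          # BUG: rules shrinks while we iterate
        if is_dominated(rule):
            rules.remove(rule)  # the iterator then skips the next rule
    return rules

def drop_dominated_fixed(rules, is_dominated):
    # Build a new list (or iterate over a copy) so removal is safe.
    return [rule for rule in rules if not is_dominated(rule)]
```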
Does anyone want to look at the updated pull request? (https://github.com/salome-eriksson/downward/pull/1). I would like to run the experiments tomorrow at the latest so we can hopefully merge this before the new release.

msg9595
Author: malte
Date: 2020-07-12.16:26:13

OK, I had a close look at this task, and it looks like this is due to the changes in the relevance analysis that we made in this issue. The behaviour of the old version, which is able to prune certain things (axioms and operators, derived and non-derived variables) looks plausible, even though I did not fully verify all aspects.
In this issue, we weakened the relevance analysis by only considering derived variables, and it seems that this might be what actually makes the difference here. On the other hand, we also fixed potential bugs in the relevance analysis related to conditional effects and related to negative conditions, so not 100% sure how this would play out with relevance analysis that is both correct and includes the non-derived variables. But given the coverage loss of lama-first, I think we should try this out.
In the previous iteration, we still had the full relevance analysis as an option. Would it make sense to look at these results again, with an emphasis on PSR-Middle and PSR-Large, astar(blind()) and lama-first?

msg9594
Author: malte
Date: 2020-07-12.12:19:15

I started having a look at the varying number of expansions before last jump between the head and base of the issue branch with astar(blind()). This does not necessarily have to be a bug in the old or new code; it could also be a difference in the relevance analysis without the old relevance analysis necessarily being incorrect.
So far, we have discussed PSR-Large #1 as a small example task. Perhaps a simpler example task to use is PSR-Middle #2. Both the base and head of the branch have 2 expansions before the last jump, but they differ in the evaluations before the last jump, which should also be deterministic (15 for head vs. 14 for base).
These numbers remain the same when addressing the bug (missing "copy") that we found in the simplify function. The base code prunes 2 out of 28 operators, while the head code prunes 0 out of 28.

msg9577
Author: salome
Date: 2020-07-10.07:48:36

New experiment results:
https://ai.dmi.unibas.ch/_tmp_files/simon/issue453-v4.1.html#summary
Looks mostly good to me. I find it a bit curious that we lose noticeably more tasks to memory issues in the issue453-v4 configs (without max) for blind and lama-first, but overall coverage looks good to me (except maybe lama?). Losing more tasks due to time with max-layers probably has to do with how the axioms are evaluated in axioms.cc (where it is good to have a big last layer).

msg9576
Author: salome
Date: 2020-07-10.01:11:31

A live meeting is not going to work for me tomorrow morning, but feel free to discuss it without me. I added Augusto, Jendrik, Gabi and Malte as collaborators to my repository; I hope this enables you to commit to it.
If there is anything urgent I should do tomorrow morning you can contact me on threema. I can also use my mobile internet as a hotspot, but with a 2GB data limit I'd rather avoid video chats ;).

msg9574
Author: salome
Date: 2020-07-10.00:52:38

Thanks, you two! The explanation of necessary literals helps; I have, however, no intuition whether negated axioms that were created because a negated literal is useful in this sense can have any positive impact on any heuristic. But since it does not affect correctness whether or not we compute them, I'd say we just leave the relevance analysis as it is.
The experiments ran through, but it seems the code still had a bug:
https://ai.dmi.unibas.ch/_tmp_files/simon/issue453-v4.html
We compute wrong plans or wrongly identify tasks as unsolvable in philosophers, optical telegraphs and layers (one of the files from the issue tracker). The issue is with the layer computation: derived variables seem to get wrong layers (the layers in output.sas differ from those in v3). According to my local tests, we need to reverse the clusters before computing the layers. I'm not sure whether this makes sense, though, since I don't understand in which order the sccs algorithm returns the clusters.
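Why the reversal matters can be illustrated as follows. Assuming the SCC routine returns clusters with dependents before their dependencies (reverse topological order, as the message above suspects), layers must be assigned after reversing the list so that every cluster sees its dependencies' layers first. This is a simplified sketch of a layering rule, not the translator's code (in particular, the real strategies only open a new layer where actually required):

```python
# Sketch of a cluster layering: a cluster gets layer 0 if it depends on
# no other cluster, otherwise one more than the highest layer among its
# dependencies. If the input comes in reverse topological order, it must
# be reversed first, or a cluster would be layered before its
# dependencies. Illustrative only; not the translator's actual code.
def compute_layers(clusters_reverse_topological, cluster_deps):
    layer = {}
    for cluster in reversed(clusters_reverse_topological):
        deps = cluster_deps.get(cluster, ())
        layer[cluster] = 1 + max((layer[d] for d in deps), default=-1)
    return layer
```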
I restarted the experiments with this change in the hopes that we get usable results tomorrow morning. I will address your comments in the pull request tomorrow and hope that we don't need new experiments after that :).

msg9573
Author: malte
Date: 2020-07-10.00:48:48

I've gone over axiom_rules.py, not as thoroughly as would be ideal, but enough for some comments. However, the github diff for this file is unhelpful, so I didn't find a good way to leave comments there. Should we discuss these live?
Most of the comments I have go into the direction of more idiomatic Python, for example using defaultdict and set/dict/list comprehensions more frequently. These are mostly small, local things. There are one or two places where I'd be interested in runtime measurements, and there is one thing in "simplify" where I wonder if there may be a bug.

msg9572
Author: malte
Date: 2020-07-09.23:40:13

I did a bit of reviewing and tried to answer the question regarding the necessary literals, but I'm not sure how much it helps. Will have another closer look at axiom_rules.py later, but I first have to finish some other TODOs.

msg9571
Author: jendrik
Date: 2020-07-09.21:43:11

I left some comments about the C++ part of the code.

msg9566
Author: salome
Date: 2020-07-09.16:58:53

We finished cleaning up the code and adding comments in order to make it more readable. In my opinion it is ready for another code review.
I also forked from our brand new git repo and recreated a (mostly squashed) history, basically one commit for each version we ran experiments for, plus the current state of Augusto's repository (and an additional push that reorders the functions in axioms.py). I created a pull request there:
https://github.com/salome-eriksson/downward/pull/1
There is one question left from my side: I don't understand why our relevance analysis says "if L is necessary and L' is in the conditional effect of L for some operator, then the negation of L' is necessary". It was like this in the original code (but only for delete effects, since we were only interested in positive literals), and we discussed this with Malte over a year ago, but I don't remember anymore :(.

msg9528
Author: malte
Date: 2020-07-07.16:40:00

regarding point 1:
It would make sense to me that this might follow from always including the positive form of axioms. If I remember correctly, we used to have the "false" default whenever we included the positive form of axioms. And now we do that always.
It doesn't necessarily have an impact on the output file. It means the default value is now redundant, but it also doesn't hurt. So I would say there is future simplification potential, but not necessarily immediate need for action.
I think it would be good if the code in the translator were amended to reduce future confusion though, for example by removing axiom_init altogether, or documenting that it is always empty (and should be removed in the future) and verifying that with an assertion.
regarding point 2:
I cannot meaningfully comment at the moment, but if this is inherited from the old code, it sounds plausible to me that this was just redundant. It's also conceivable that some of the transformations that happen in the code affect (or used to affect) the SCCs, for example if something is pruned by the relevance analysis, which can then break an SCC into multiple ones. (But something that used to be in different SCCs should certainly never end up in the same SCC in the second computation, so this would only change in one direction.)

msg9527
Author: augusto
Date: 2020-07-07.16:32:44

We made some changes in the code (PR: https://github.com/aibasel/augusto-downward/pull/1 ) but we expect to change it even more until tomorrow, so no need to review it for now.
There are two issues that we would like to discuss:
1. The axiom_init variable that is returned from handle_axioms seems to always be an empty list. Is this expected? If yes, this probably changes the semantics of the SAS file, right?
2. The sccs.py returns the clusters in (reversed?) topological order but later on in the code, when computing the axiom layers, we compute the topological order again. What is the reason we need to recompute it?
Besides that, we would like to add more meaningful comments to the code.

msg9518
Author: salome
Date: 2020-07-07.11:59:30

We figured out that the drop in performance stems from doing a multitude of membership tests on a list rather than a set. We decided to go forward with a code review, and after it is done and the code is adjusted, we will run another round of experiments.
The pull request can be found here (same as below):
https://github.com/aibasel/augusto-downward/pull/1
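The list-vs-set fix is the classic membership-test asymmetry: `x in some_list` scans the list (O(n) per test), while `x in some_set` is a hash lookup (O(1) on average), so many membership tests against a growing collection are quadratic with a list and roughly linear with a set. A minimal illustration (not the translator's code; no timing numbers are claimed, since they vary by machine):

```python
# "c in collection" has the same semantics for a list and a set, but a
# list is scanned element by element (O(n)) while a set does a hash
# lookup (O(1) on average). The fix therefore changes only the data
# structure, not the behavior. Illustrative sketch.
def count_hits(candidates, collection):
    """Count how many candidates occur in the collection."""
    return sum(1 for c in candidates if c in collection)
```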

msg9492
Author: augusto
Date: 2020-07-06.18:26:57

Yep, you are completely right. I messed up all the numbers. I will have a closer look at it tomorrow as well.

msg9491
Author: malte
Date: 2020-07-06.18:09:53

Unless I looked at the data wrong, lama-first solves 35 of those on which we now fail.
Also, looking at translator_time_done, it looks like we're now more than a factor of 10 slower in the translator -- I think we probably need to look at this.

msg9490
Author: augusto
Date: 2020-07-06.18:06:09

If I remember correctly, airport and psr-large are the domains where the DNF explosion is problematic. Isn't that the issue with the translator-out-of-memory tasks? I assume that our new implementation would have even larger issues with this.
Also, it seems that most of these are not solved by any configuration anyway. Most of them fail during the search component in the old base configurations.

msg9485
Author: malte
Date: 2020-07-06.16:32:10

The time difference doesn't look too drastic to me (except for the max-layers versions), but it warrants a second look. Roughly speaking (and with some caveats), a constant speedup factor translates to a fixed additive difference in score_total_time for each task, and the scores are added up across tasks.
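The additive-difference remark can be made concrete. Assuming a lab-style logarithmic time score of the form score(t) = 1 - log(t)/log(T) for 1 <= t <= T (the exact formula used by the experiment scripts may differ; the time limit of 1800 s is also an assumption), a constant speedup factor c shifts every task's score by the same additive amount log(c)/log(T), independent of t:

```python
import math

# Hedged illustration of the scoring remark: assuming a logarithmic
# time score, score(t) = 1 - log(t)/log(T) for 1 <= t <= T (the exact
# formula in the experiment scripts may differ), halving the runtime
# improves the score by log(2)/log(T) regardless of the runtime itself.
def time_score(t, limit=1800.0):
    t = min(max(t, 1.0), limit)  # clamp into the scored range
    return 1.0 - math.log(t) / math.log(limit)
```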

msg9484
Author: salome
Date: 2020-07-06.16:21:59

Results are in; they don't look too good:
http://ai.dmi.unibas.ch/_tmp_files/simon/issue453-v3.html
We are losing 50 tasks due to the translator running out of time, in airport and psr-large. I already narrowed it down to the AxiomDependencies class; I'm guessing it has to do with the fact that we use OrderedDicts, but I will need to look closer into it. If someone wants to help, feel free to join :).
Otherwise we're also losing a bit of search performance. It doesn't look too tragic in my eyes, but I'm not too familiar with how the score_search_time attribute should be interpreted.

msg9394
Author: malte
Date: 2020-07-01.21:04:28

Salomé, Augusto, Gabi and I have discussed this issue on Zoom.
(Please extend/correct the following as necessary.)
We decided to go for:
- layer strategy: we will keep min and max as options, with min as default
- overapproximate negated axioms: always overapproximate cycles
- overapproximate necessary literals: always consider all non-derived literals relevant
- keep redundant axioms: always keep the positive form of axioms
Salomé and Augusto will update the pull request to reduce it to these options. Since we are close to merging this, I suggest also rebasing the code to be based on the current version of the default branch if that is not already the case.
Once this is done, we can do a code review.
We can then also rerun a new experiment with three versions: old, new with min option, new with max option. In terms of planner configurations, we want to cover astar(blind()), eager greedy and lazy greedy options with the FF heuristic and preferred operators, and lama-first.
I think we can have the experiments and code review go on concurrently.
We should also test all the relevant problem tasks referenced (indirectly) from issue924 to see if these are now fixed or what their status is.

msg9357
Author: salome
Date: 2020-06-29.17:40:31

Augusto converted his repository to git and created a new pull request on github: https://github.com/aibasel/augusto-downward/pull/1

msg9356
Author: salome
Date: 2020-06-29.16:40:48

Augusto and I just had another look at the code and the experiments. We would recommend removing all options again (except maybe the layer strategy) and only picking the one which makes the most sense or performed best:
1. layer strategy: always use min. Min is in most cases faster when using blind search. Our guess is that the successor generator is more efficient when the number of layers is small, but we are not yet sure about the details. We could also keep both options in case the max option turns out to be useful in the future (I somehow remember that it could be beneficial for Simon's project to have small layers?).
2. overapproximate negated axioms: always overapproximate cycles. The experiments showed (as expected) that the more we overapproximate, the worse FF gets. Since we do need to overapproximate cycles in order to get rid of the bug, we stick with this.
3. overapproximate necessary literals: there is no clearly better option performance-wise, so we suggest choosing the option we like most conceptually, since we don't see the advantage of keeping several options.
4. keep redundant axioms: given that the task size does not grow that much, we see nothing negative in just keeping the redundant positive axioms. Maybe this could even make the successor generator nicer if we don't need to deal with default-true derived variables?
Aside from these conclusions, we want to investigate two things more thoroughly:
1. Find out why min-layers is better
2. We noticed that the number of evaluations sometimes differs between configurations in ways that should not influence the search. For example, just changing whether or not to keep redundant positive axioms can result in different evaluation counts in the last f-layer for astar+blind (tested locally; the experiments test lazy-greedy). Our main theory is that the successor generator is perhaps not deterministic and, depending on the internal representation of the task, might for example produce different orders in the match tree.

msg9197
Author: salome
Date: 2020-02-13.12:33:51

I ran experiments for all possible configurations of the options detailed in the meta issue. To reiterate we had the following options:
1. layer strategy (how many axiom layers should we create):
a) max
b) min (current implementation)
2. overapproximate negated axioms:
a) all: trivial overapproximation for all literals
b) cycles: trivial overapproximation for literals with cyclic dependencies
c) none: negate all axiom rules exactly (current implementation, leads to bugs)
3. overapproximate necessary literals:
a) exact: do a full relevance analysis
b) non-derived: consider all non-derived literals (both positive and negative) as necessary
c) positive: only consider positive non-derived literals as necessary (current implementation)
4. keep redundant axioms (if a derived variable is only needed negatively, should we keep the positive axiom rules or not?)
a) no (current implementation)
b) yes
I tested both blind and ff search, and also tested the problems uploaded to the various issues mentioned in the meta issue (issue924). The results can be found here:
https://ai.dmi.unibas.ch/_tmp_files/simon/issue453-v2.html
https://ai.dmi.unibas.ch/_tmp_files/simon/issue453-v2-custom-pddls.html
Since the tables are nearly impossible to parse due to the number of configurations, I also made some plots which try to highlight the influence of the different options:
https://ai.dmi.unibas.ch/_tmp_files/simon/issue453-v2-figures.pdf
Overall I see the following trends:
- Using the minimum number of axiom layers seems to be faster in general (especially visible for blind search). The only possible explanation I found is msg3058. Essentially, the code does not apply negation-by-failure rules in the last iteration, and I guess if that layer is smaller, we do not save as much time with this optimization. But I am very unsure about the details here.
- Overapproximating negated axiom rules for literals in cycles resolves the bug from this issue. It does negatively influence the FF heuristic in some rare cases, but I would say this is a small price to pay for removing a bug.
- Overapproximating all negated axiom rules very noticeably influences the FF heuristic (no surprises there).
- The type of relevance analysis does not seem to influence the time the translator needs nor overall coverage all that much.
- Keeping redundant positive axioms does not hurt task size significantly, in the worst cases it results in about a 50% bigger task file.

msg8893
Author: malte
Date: 2019-06-12.23:16:19

These rules are not quite complete/correct. For example, the second rule talks
about the effect condition of an operator, but operators don't have effect
conditions; effects do.
It should be something like: If L is necessary and <if cond then L> is a
conditional effect of an operator o (cond may be trivial), then all literals in
pre(o) and cond are necessary.
The last overapproximation rule is probably meant to say that the complement of
L is (overapproximated as) necessary if *L* occurs in any precondition, effect
condition or axiom body; otherwise that rule is identical to the one before.
Note that this overapproximation is a lot weaker than the (buggy, but assuming
that were fixed) overapproximation we currently use; I'm not sure it is
meaningfully different from "everything is relevant". Rather than use this
overapproximation, it is probably a better implementation effort/benefit
trade-off not to compute any notion of "necessary" at all.
|
msg8882 (view) |
Author: augusto |
Date: 2019-06-12.14:47:00 |
|
As discussed today, we are refactoring the code for the axiom rules generation
and one issue is the definition of "necessary literals". When computing the
axiom rules to be used, we only use the rules for which the variable in the head
is necessary.
Our definition of 'necessary literal' is:
- A literal L is necessary if L occurs in the goal
- If a literal L is necessary and L \in eff(o) of some operator O, then every
literal L' \in pre(o) and L' \in effcond(o) is necessary
- If a negated literal ~L is necessary and L \in eff(o) of some operator O,
then every literal L' \in effcond(o) is necessary
- If a literal L is necessary and occurs in the head of some axiom rule A, then
every literal L' occurring in the body of A is necessary
- If a negated literal ~L is necessary and L occurs in the head of some axiom
rule A, then every literal L' occurring in the body of A is necessary
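The rules above amount to a fixpoint computation. A minimal sketch (not the translator's actual code — the task representation here is made up for illustration: literals as (variable, value) pairs, operators as a precondition plus conditional effects, axioms as (head, body) pairs):

```python
def necessary_literals(goal, operators, axioms):
    """Fixpoint over the five 'necessary literal' rules above (a sketch)."""
    necessary = set(goal)                      # rule 1: goal literals
    while True:
        new = set()
        for pre, effects in operators:
            for cond, (var, val) in effects:
                if (var, val) in necessary:    # L in eff(o):
                    new |= set(pre) | set(cond)  # rule 2
                if (var, not val) in necessary:  # ~L necessary, L in eff(o):
                    new |= set(cond)             # rule 3
        for (var, val), body in axioms:
            # rules 4 and 5: head (or its negation) necessary -> body literals
            if (var, val) in necessary or (var, not val) in necessary:
                new |= set(body)
        if new <= necessary:
            return necessary
        necessary |= new
```

For example, with the goal literal g, an operator whose conditional effect achieves g, and an axiom whose head appears in that effect condition, the fixpoint pulls in the precondition, the effect condition, and the axiom body.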
However, using these rules, we need to perform a relevance analysis to figure
out which literals are necessary or not.
Another possibility is to use an approximation of the rules in order to avoid
computing the relevance analysis at this point. The approximation of these rules
would be:
- A literal L is necessary if it occurs in the goal
- A literal L is necessary if it occurs in any precondition, effect condition,
or body of axiom rules
- A negated literal ~L is necessary if it occurs in any precondition, effect
condition, or body of axiom rules
This approximation makes things easier for now, but it might hurt performance
significantly. Ideally, we would like to test both definitions of *necessary
literal* to see which one is preferable.
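A minimal sketch of the approximation (again illustrative only, with the same kind of made-up representation: literals as (variable, value) pairs): it simply marks both polarities of every literal that occurs in a precondition, effect condition, or axiom body as necessary.

```python
def necessary_literals_approx(goal, operators, axioms):
    """Overapproximation: no relevance analysis, just occurrence (a sketch)."""
    necessary = set(goal)
    occurring = set()
    for pre, effects in operators:
        occurring.update(pre)
        for cond, _effect in effects:
            occurring.update(cond)
    for _head, body in axioms:
        occurring.update(body)
    for var, val in occurring:
        necessary.add((var, val))        # L is necessary ...
        necessary.add((var, not val))    # ... and so is ~L
    return necessary
```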
|
msg8865 (view) |
Author: salome |
Date: 2019-06-07.12:00:57 |
|
We discussed several options today for how a more sophisticated implementation
should look. In general, we want to restructure the axiom code to use some
clearer data structures, starting with computing an SCC early on which will be
used to (1) decide for which variables we can safely only keep the negative
axioms, and (2) compute the axiom layers at the end.
For (1), we also want to investigate whether it is even beneficial to delete the
original axioms, since we can only do this for variables which are in their own
SCC. (For all other variables, we can add the sophisticated negative axioms, but
are not allowed to delete the original ones and switch the default value.)
For (2), we want to test whether we should aim for a minimal or maximal number
of layers.
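The SCC computation mentioned above could be sketched as follows — an illustrative Tarjan implementation over made-up (head, body) rules, not the code in the planner. There is an arc from a rule's head to every derived variable in its body; variables sharing an SCC must share an axiom layer.

```python
from collections import defaultdict

def strongly_connected_components(rules):
    """Tarjan's SCC algorithm on the head -> body dependency graph (a sketch)."""
    graph = defaultdict(set)
    for head, body in rules:
        graph[head].update(body)
        for var in body:
            graph[var]  # ensure every body variable is also a vertex
    index, lowlink = {}, {}
    stack, on_stack, components = [], set(), []
    counter = [0]

    def visit(v):
        index[v] = lowlink[v] = counter[0]
        counter[0] += 1
        stack.append(v)
        on_stack.add(v)
        for w in graph[v]:
            if w not in index:
                visit(w)
                lowlink[v] = min(lowlink[v], lowlink[w])
            elif w in on_stack:
                lowlink[v] = min(lowlink[v], index[w])
        if lowlink[v] == index[v]:       # v is the root of a component
            component = set()
            while True:
                w = stack.pop()
                on_stack.discard(w)
                component.add(w)
                if w == v:
                    break
            components.append(component)

    for v in list(graph):
        if v not in index:
            visit(v)
    return components
```

Tarjan conveniently emits the components in reverse topological order, which is exactly what a subsequent layer assignment would need.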
|
msg8851 (view) |
Author: augusto |
Date: 2019-06-06.16:05:17 |
|
Oh, I did not check these issues before.
I will add the ones related to the issue experiments.
Since the grid is full now, I ran a few local tests using the lazy greedy search
with FF and 8 GB as the memory limit. I call the current version of the axiom
rules in the translator "normal" and the one we are testing "overapproximation".
The version using the overapproximation of axiom rules solves almost all
problems, while the normal version runs out of memory in almost all cases. The
only exception is the "graph domain" where we fail during the computation of the
canonical model. In summary, our method solved all the performance issues of
these domains. In fact, all instances had less than one second of search time
when using the overapproximation.
More detailed results: (just to keep track of what is related and what is not)
- problems in issue165
- the clean cup domain runs out of memory very quickly using the normal version,
but it is solved very quickly (1 second) using the overapproximation;
- for the blocksworld no clear domain: the small instance (5 blocks) is easily
solved by both versions, while the large one (10 blocks) runs out of memory for
our normal version but is solved in less than one second by the
overapproximation version. The same happens for Patrik's
"crash-dummy-domain.pddl" variant of this domain.
- the graph domain runs out of memory while computing/instantiating the
canonical model. Thus, it is not related to the axiom problem.
- problems in issue450
- citycar domain: computing the canonical model and instantiating the schemas
already uses ~6GB. Using the normal axiom rules, we run out of memory as soon as
we start simplifying the axioms. Using the overapproximation, the axioms still
take a while to generate. In total, the overapproximation generates 1,177,879
axioms. However, FF reports the task as unsolvable. Are we sure this instance is
solvable? I ran it with goalcount and the blind heuristic and it also did not
find a solution. If it is solvable, then we have a bug in our code.
- mst: runs out of memory using the normal version; solved very quickly using
the overapproximation.
- problems in issue533
- the instances from this issue did not produce any error in either version.
|
msg8849 (view) |
Author: malte |
Date: 2019-06-06.15:38:20 |
|
These are good points/questions. Perhaps have a brief discussion about this?
More generally, our current collection of issues related to this is a bit of a
mess; it may help to consolidate it a bit.
|
msg8848 (view) |
Author: salome |
Date: 2019-06-06.15:21:31 |
|
Thanks Augusto!
To my understanding, the next step would be to only use the trivial
overapproximation when there are cycles containing negative derived literals.
However, I have a few questions about implementation details:
1. The best place to look for cycles would be when creating the axiom layer.
However, this happens only after we make the decision to overapproximate or not.
The easiest solution is to simply calculate the dependency graph twice, once in
the beginning when we want to see if cycles exist, and later when creating the
layers. But in msg8204 Malte mentioned that it might be a good idea to compute
the layers much earlier. Plus I think there was also somewhere an argument to
change the layer computation from creating as few layers as possible to creating
as many as possible?
My suggestion would be to open a new issue for possible changes to the layer
computation, and for this issue stick with the simple solution (unless performance
suffers too much).
2. As Augusto mentioned we currently consider *all* literals necessary when
using the overapproximation. We do this because axioms for unnecessary literals
get discarded. In the normal setting, if a derived variable is only necessary in
its negative form, the "original" axioms get discarded and instead axioms for
its negation get introduced. But since the overapproximation setting only
introduces an overapproximation for the negative form, we would lose information
by discarding the "original" axioms. Would a correct behaviour be to stick with
the computation of necessary literals, but always add both the literal and its
negation?
|
msg8847 (view) |
Author: malte |
Date: 2019-06-06.15:19:09 |
|
There are also instances attached to issue165, issue450, issue533 and issue862
that may be related. I think it would make sense to consolidate them in one
place and include them in the issue experiments. (I'm not saying all of these
are related, but all of them may be.)
|
msg8846 (view) |
Author: augusto |
Date: 2019-06-06.15:14:10 |
|
I assume that you are talking about the instances posted by Patrik in this issue
and posted by Gabi in issue862, right?
If yes, then both instances are correctly solved when using the
overapproximation (but the one by Patrik is obviously still not correct when not
using it).
|
msg8845 (view) |
Author: malte |
Date: 2019-06-06.15:00:49 |
|
So far, so good!
Can you test the benchmarks that are linked to this issue and the related issues
where our code currently behaves incorrectly, and see if the change fixes the
problem?
I'm reasonably confident we can get performance back up by addressing the issues
with simplification/relevance analysis and by not overapproximating in cases
where the exact negated rules can be computed with reasonable effort.
The only domain where there may be conceptual challenges ahead is PSR, but for
now I'm mildly optimistic.
|
msg8844 (view) |
Author: augusto |
Date: 2019-06-06.14:54:30 |
|
The results are in
http://inf.ufrgs.br/~abcorrea/_issues/issue453/issue453-v1.html
(Sorry for the sciCORE error messages. But my module loads seem correct --
assuming that Jendrik's are also correct :-)
I ran the experiments proposed by Malte in msg8235. "normal" indicates the
current method used by Fast Downward, "overapprox" is using the option to
trivially overapproximate axiom rules as described in msg8795.
I ran an initial experiment to filter for instances where we generated at least
one axiom, so I am only reporting on those instances where axioms are produced;
for some reason, however, I removed airport-adl, and I am running a new
experiment only with airport-adl. It turns out that all other domains are the
same as those reported by Malte in msg8235.
The overapproximation seems to be very harmful in general. It increases the
number of axioms significantly (as expected) but it also makes the heuristic
estimates much worse (check the initial h-values). It is interesting to see that
the causal graph heuristic (cg) is significantly more informed than the
delete-free heuristics in a few domains when using the overapproximation approach.
We also have an instance of psr-large where the translator runs out of memory
with our overapproximation, which is very bad since the "normal" version
multiplies out the CNF and still doesn't run out of memory. The new approach is, in
general, significantly slower than the "normal" one.
One possible issue with our approach is that we create axioms for every single
literal instead of just using the literals which are necessary (i.e., that
appear in a precondition or in the goal).
|
msg8819 (view) |
Author: malte |
Date: 2019-06-05.19:33:03 |
|
As discussed offline: msg8235 lists the axiom-related search plugins that I
think should be tested. I may have missed something when I compiled the list,
but I think it should give us decent enough coverage.
|
msg8810 (view) |
Author: augusto |
Date: 2019-06-05.15:07:29 |
|
We finished the implementation of the trivial overapproximation as an option for
the translation. We will run some experiments using the causal graph heuristic,
the delete-relaxation heuristics, and the landmark heuristics to check whether
they work fine with it. I already noticed that the Zhu/Givan landmark
factory is not safe when used on ADL domains. (We should probably state this more
clearly in the wiki.)
Malte, is there any specific configuration that you want us to test?
I thought about: context-enhanced additive heuristic, causal graph heuristic, FF
heuristic, LM-count heuristic with RHW landmark factory, blind heuristic, and
some other heuristic which does not rely on axioms (e.g., iPDB). Am I missing
some obvious configuration that we would like to test?
|
msg8795 (view) |
Author: malte |
Date: 2019-06-04.11:11:10 |
|
We want to work with two representations for the axioms. Firstly, the "actual"
true representation that is used by the axiom evaluator. Unlike currently, we
want this to be the original, unadulterated axioms, i.e., not reformulate some of
them from positive to negative and change the default value. All default values
will be the negative literals, and all rules will derive a positive literal. Unlike
currently, in this representation we have no explicit representation of the
conditions under which we get the negated value, only the implicit one "we get it
if we don't derive the positive one".
For the heuristics, we need to tell them something about the conditions under
which we can get the negative values. Previously, we had the reformulated rules
for this. But we can also use any overapproximation. For now, we will try the
trivial overapproximation where we use "fake rules" of the form "=> not x"
(empty left-hand side) to tell the heuristics "we can get not x unconditionally".
Some parts of the planner need the exact representation, some parts need
explicit rules for the negative values, but can use an overapproximation for now.
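As a toy illustration (the rule representation here is made up, not the planner's), the trivial overapproximation just emits one empty-bodied rule per derived variable:

```python
def fake_negative_rules(derived_variables):
    """One 'fake rule' (empty body -> not x) per derived variable (a sketch).

    A rule is a (body, head) pair; the head is the negative literal,
    telling heuristics the negated value is reachable unconditionally."""
    return [([], (var, False)) for var in derived_variables]
```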
|
msg8790 (view) |
Author: augusto |
Date: 2019-06-03.17:06:25 |
|
We are back to this issue. We decided to implement the overapproximation, as
explained by Malte in msg8047, but generating these rules for every literal
occurring in some axiom rule. In this way, we can solve the problem caused by
the cyclic dependency of derived predicates, and we do not break the
implementation of the delete-free heuristics nor the causal graph heuristic. I
will create a branch in my repo for this issue.
Besides the modification in axiom_rules.py, we also need to adapt the search
component (axiom.cc) to avoid possible "non-determinism" when evaluating the
axioms with this overapproximation. Basically, whenever two axioms are triggered
with different values for the same derived predicate, we should set the value of
this predicate to its default value.
Another issue is regarding the relevance analysis performed by the translator
after the generation of axiom rules. We must guarantee that no information is
lost due to the overapproximation of the axiom rules.
|
msg8206 (view) |
Author: malte |
Date: 2018-12-08.13:24:43 |
|
Added a pointer to msg8204 to the summary.
|
msg8047 (view) |
Author: malte |
Date: 2018-11-04.07:05:43 |
|
I've thought about this some more, and I want to write down the outcomes of this
thought process before I forget about it again. What we do with this in order to
fix this issue is a separate question.
The problem is the following: we have a set of axioms with negation as failure
semantics. We have a set of primary state variables P = {p_1, ..., p_m} and a
set of derived state variables D = {d_1, ..., d_n}. Let's assume that all state
variables are binary and that all derived variables are false unless derived
otherwise. (In Fast Downward it's a bit more complicated, but conceptually it's
easy to bring things into this form.)
So all derivation rules look like this:
l_1, ..., l_k -> d_h
where each l_i is one of "p_j", "\neg p_j", "d_j" or "\neg d_j" for some index j.
We have the usual stratification constraints: we can assign a level to every
derived state variable such that
level(d_i) <= level(d_j)
whenever d_i occurs in the body of a rule with head d_j, and
level(d_i) < level(d_j)
whenever \neg d_i occurs in the body of a rule with head d_j.
We would ideally like to come up with an equivalent set of rules that are
expressed in terms of the negations of the d_i. That is, we want to introduce a
new set of variables \hat{d_i} and express new derivation rules in terms of
these variables instead of the hatless d_i variables such that when we evaluate
the rules, \hat{d_i} is true iff d_i is false.
Note that there is some freedom in how to assign the levels. We currently group
as many variables as possible into the same level. An alternative, which would
be more useful here and no harder to compute, would be to only group variables
in the same level if we absolutely have to, i.e., if we have a cyclic set of
constraints
level(d_{i_1}) <= level(d_{i_2}) <= ... <= level(d_{i_n}) <= level(d_{i_1})
that forces all these levels to be the same. So let's assume we do that.
If all variables had different levels, the hat rules would be easy to define.
The rules for d_i would look like this:
l_{1,1}, ..., l_{1,k_1} -> d_i
...
l_{s,1}, ..., l_{s,k_s} -> d_i
where all literals l_{i,j} refer to primary state variables or to literals d_j or
\neg d_j where level(d_j) < level(d_i). (We can rule out the case where d_i itself occurs in
the body, which is strictly speaking allowed, by pruning such rules because they
can never fire unless d_i was already derived, in which case they don't
contribute anything.)
With an inductive argument, we can assume that we already know how to construct
rules for the \hat{d_j} of lower level, and in general we have that d_i is true
iff the DNF formula
(l_{1,1} and ... and l_{1,k_1})
or ... or
(l_{s,1} and ... and l_{s,k_s})
is true, which means that d_i is false iff the formula
(\neg l_{1,1} or ... or \neg l_{1,k_1})
and ... and
(\neg l_{s,1} or ... or \neg l_{s,k_s})
is true, which we can easily express as a CNF formula over the variables p_j and
\hat{d_j}.
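For the acyclic case just described, the CNF-to-DNF multiplication can be sketched like this (illustrative only; literals are made-up (variable, positive) pairs, and the exponential size of the product is exactly the multiplying-out of the CNF mentioned elsewhere in this issue):

```python
from itertools import product

def negate(literal):
    (var, positive) = literal
    return (var, not positive)

def negated_rules(bodies):
    """Rules for 'not d_i' from the rule bodies (DNF) that derive d_i (a sketch).

    d_i is false iff every body contains at least one false literal, so we
    pick one literal per body, negate it, and emit one rule per choice."""
    return [frozenset(negate(lit) for lit in combo)
            for combo in product(*bodies)]
```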
So the problematic cases are only the ones where we have a cyclic dependency. In
a larger set of rules where some variables are at the same level and some are
not, we can deal with each of the cycles separately if we can come up with an
exact representation for the hatted variables.
If we can only come up with approximations of the \hat{d_i} variables (over or
under) locally for one cycle and we want to combine them into an approximation
(over or under) globally, I think we need to maintain *both* an over- and an
underapproximation because variables from earlier levels can be used in a
negated or non-negated way, and negation turns over- into underapproximations
and vice versa.
I think it should be possible to come up with an exact solution with a
polynomial blowup, which is probably too large for many of our benchmarks, but I
think might still be interesting from a theoretical perspective. More precisely,
the blowup would be by a factor of M, where M is the maximal number of derived
variables in the same layer (dependency cycle). The idea is to unroll the
variables over time to get rid of dependency cycles and observe that with M
variables in one cycle, we never need more than M time steps until we cannot
make any further derivations.
Consider a slight variant of Patrik's example from this issue (with a second
primary variable p_2). For consistency with the above, rename the primary
variable "P(bottom)" as p_1 and the derived variables Q(left) as d_1, Q(right)
as d_2 and Q(top) as d_3. Our derivation rules are:
\neg p_1 -> d_1
\neg p_2 -> d_2
d_1 -> d_3
d_2 -> d_3
d_3 -> d_1
d_3 -> d_2
The last four rules force all d_i into the same level.
There are three variables, so let's create three copies d_{i,0}, ..., d_{i,2}
for each of them. We then replace the original rules by one or more copies as
follows:
If the LHS contains no variables from the cycle, create only one copy, where the
RHS variable gets time step 0.
Otherwise, create M-1 copies where the LHS has time steps {0, ..., M-2}, and the
RHS has the next time step. We also create "no-op" rules that allow us to
propagate the truth of a variable to the next time step. This gives us the
following new rules:
from the rules not involving cycle variables on the LHS:
\neg p_1 -> d_{1,0}
\neg p_2 -> d_{2,0}
from the rules involving cycle variables on the LHS:
d_{1,0} -> d_{3,1}
d_{1,1} -> d_{3,2}
d_{2,0} -> d_{3,1}
d_{2,1} -> d_{3,2}
d_{3,0} -> d_{1,1}
d_{3,1} -> d_{1,2}
d_{3,0} -> d_{2,1}
d_{3,1} -> d_{2,2}
no-op rules:
d_{1,0} -> d_{1,1}
d_{1,1} -> d_{1,2}
d_{2,0} -> d_{2,1}
d_{2,1} -> d_{2,2}
d_{3,0} -> d_{3,1}
d_{3,1} -> d_{3,2}
This new ruleset has no cycles any more, and we should be able to prove that we
can derive d_{i,M-1} iff we can derive d_i in the original set of rules.
Therefore we can use d_{i,M-1} in place of d_i in all later layers (where it is
treated like a primary variable).
Because the new rule set has no cycles, we can use it to compute the hatted
variables.
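The unrolling construction might be sketched as follows (illustrative code, not the planner's: rules are (body, head) pairs, bodies contain variable names, and a time-stamped copy is represented as a (variable, step) pair):

```python
def unroll(rules, cycle):
    """Unroll one dependency cycle over M = |cycle| time steps (a sketch)."""
    M = len(cycle)
    new_rules = []
    for body, head in rules:
        if not any(v in cycle for v in body):
            # no cycle variable on the LHS: one copy, RHS at time step 0
            new_rules.append((body, (head, 0)))
        else:
            # M-1 copies with LHS at steps 0..M-2 and RHS at the next step
            for t in range(M - 1):
                stamped = [(v, t) if v in cycle else v for v in body]
                new_rules.append((stamped, (head, t + 1)))
    # no-op rules propagating the truth of a variable to the next step
    for v in cycle:
        for t in range(M - 1):
            new_rules.append(([(v, t)], (v, t + 1)))
    return new_rules
```

On the three-variable example above this yields exactly the 2 + 8 + 6 = 16 rules listed in the message.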
As I said, the blow-up is probably unacceptable in many cases, so let's also
briefly look at approximate solutions. I'll only talk about how to over- and
underapproximate on one cycle and from that we should be able to
over-/underapproximate for the whole set of rules. We will need separate
variables for the over/underapproximation, but let's ignore this here because I
only want to talk about one layer.
Let's return to our original set of rules:
\neg p_1 -> d_1
\neg p_2 -> d_2
d_1 -> d_3
d_2 -> d_3
d_3 -> d_1
d_3 -> d_2
If we want to overapproximate on this layer, a simple way is to replace all
occurrences of cycle variables on the LHS by "True"; to underapproximate, we
could use "False". For the overapproximation, this would give us:
\neg p_1 -> d_1
\neg p_2 -> d_2
True -> d_3
True -> d_3
True -> d_1
True -> d_2
This is quite drastic: it means that d_1, d_2, d_3 are always true, so it
trivializes and loses all information. The underapproximation is less drastic.
Having False on the LHS means dropping the rule entirely, so we get:
\neg p_1 -> d_1
\neg p_2 -> d_2
Which means that d_3 would always be False (an underapproximation), but at least
d_1 and d_2 would get their correct truth values. (The correct semantics for
this particular set of rules is that d_1, d_2, d_3 are all equivalent to (\neg
p_1 or \neg p_2).)
These approximations have the effect of breaking all the cycles, so we can again
compute (approximated) hatted versions based on these new rules.
But there is no need to be so drastic. We don't need to replace *all*
occurrences of cycle variables on the LHS by True (False) to over- (under-)
approximate. It is enough to replace sufficiently many of them to break all the
cycles.
For example, for the underapproximation, let's say we only replace the LHS
occurrences of d_3 with False, but not the ones of d_1 and d_2. Then we get:
\neg p_1 -> d_1
\neg p_2 -> d_2
d_1 -> d_3
d_2 -> d_3
This is acyclic, so we can compute the negation based on this, and it is
actually perfect. However, for the overapproximation, unless I overlooked
something, we cannot gain anything in this example by using True on the LHS
selectively: any way of breaking the cycles will make the rules trivial.
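The selective replacement can be captured in one function, taking the set of LHS occurrences to break — all cycle variables for the drastic variant, a subset for the selective one. Again, this is only an illustration with made-up (body, head) rule pairs:

```python
def approximate(rules, break_vars, over=True):
    """Break cycles by replacing selected LHS variables by True/False (a sketch).

    Replacing a body variable by True means dropping the variable from the
    body; replacing it by False means dropping the whole rule."""
    result = []
    for body, head in rules:
        if over:
            result.append(([v for v in body if v not in break_vars], head))
        elif not any(v in break_vars for v in body):
            result.append((body, head))
    return result
```

With break_vars = {d_3} and over=False this reproduces the four-rule underapproximation above; with all three cycle variables and over=True it produces the trivial always-true rules.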
Of course the over/underapproximation and unrolling ideas can also be combined.
If we have a very large cycle, we could try to approximate it a bit to break it
into a smaller cycle, which we can then unroll.
That's all I have. I hope I didn't make a major mistake that makes the whole
concept invalid. If this works, it's probably well known in logic
programming/answer set programming/etc. circles, and it would be good to talk to
an expert in this area to find out if they know more about this problem, e.g. if
they have better approximations or more compact ways to compute the exact result.
|
msg8040 (view) |
Author: malte |
Date: 2018-11-04.04:09:51 |
|
The proposed fix for issue862 interacts with this issue.
I've added code comments to axiom_rules.py which explain the interaction. All
relevant comments mention the keyword "issue453", so searching for this keyword
in axiom_rules.py while working on this issue should be enough to find all of
them. I've added a reminder of this to the summary.
|
msg3332 (view) |
Author: patrik |
Date: 2014-08-21.03:10:16 |
|
I updated the alternative translator to the current sas-file format, and it
seems to work. Attaching the translated file (.sas) and the result of running
the preprocessor on this file (.pre). Running A*/blind with the resulting
preprocessed file produces the expected plan.
|
msg3323 (view) |
Author: malte |
Date: 2014-08-18.11:06:54 |
|
> Just to clarify: This is only a translator issue? In other words, provided a
> "direct" translation of axioms (i.e., just grounding and simplifying them to
> the required form), the basic search framework (successor generator, etc)
> should work?
Possibly; I'm not entirely sure. It could be the case that some of the later
code relies on the presence of the negated axioms for relevance analysis, i.e.,
the preprocessor might wrongly determine that a variable is irrelevant if the
negated axiom rules are absent. I hope I'll be able to look into this on Wednesday.
|
msg3322 (view) |
Author: patrik |
Date: 2014-08-18.04:58:24 |
|
Just to clarify: This is only a translator issue? In other words, provided a
"direct" translation of axioms (i.e., just grounding and simplifying them to the
required form), the basic search framework (successor generator, etc) should work?
From our perspective, the best (short-term) fix may be to dust off one of my
alternative translators and bring it up to date with the current FDR format.
|
msg3321 (view) |
Author: malte |
Date: 2014-08-17.11:36:20 |
|
I opened a separate issue for getting rid of the negated axiom rules (issue454),
as I'm not sure this is the only thing going on here, but that bit will be quite
a bit of work by itself. (Didn't add you to the nosy list, so please do if you
want to follow this.)
|
msg3319 (view) |
Author: malte |
Date: 2014-08-17.09:12:59 |
|
Agreed. It's also a question of the relative amount of work and time available
in the immediate future, though -- it may be quite a bit less work to fix the
current implementation without changing the architecture.
|
msg3318 (view) |
Author: patrik |
Date: 2014-08-17.03:15:11 |
|
Thanks for the detailed explanation!
So, a quick workaround would be to trick the translator into believing that the
positive form of the derived Q predicate is relevant. I tried this by adding a dummy
operator identical to the existing set-p except it has an extra parameter ?y and
the precondition (secondary-q ?y). This problem is indeed solvable :)
I don't think it's for me to tell you how to solve the problem, but it seems
that keeping the translation mechanism simple and correct (and, if necessary,
computing the reformulated axiom set on demand, storing it separately, and using
where it is helpful) would be a more modular solution. Working out the
semantically correct way of reasoning about negative axioms - particularly in a
relaxed setting - is indeed interesting (in fact, that's the reason we're now
looking at these problems...) but also possibly quite tricky. I'm not sure if it
can be done in standard logic, or if it requires something else (like
well-founded semantics/answer-set programming, etc).
|
msg3317 (view) |
Author: malte |
Date: 2014-08-16.10:27:43 |
|
Thanks, Patrik! I think this is a conceptual error in the translator.
I'll try to explain what I think is going on. I'm adding Gabi because
we recently discussed this part of the translator.
The grounded problem has the following structure:
one non-derived fact P(bottom), initially false.
three derived facts Q(left), Q(right), Q(top) with derivation rules:
not P(bottom) => Q(left)
not P(bottom) => Q(right)
Q(left) => Q(top)
Q(right) => Q(top)
Q(top) => Q(left)
Q(top) => Q(right)
one action:
PRE: nothing; EFF: P(bottom)
goal:
not Q(top)
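To see the negation-by-failure semantics concretely, here is a small sketch of evaluating such axioms (illustration only, not the planner's axiom evaluator: state facts are made-up (variable, value) pairs, derived variables default to false, and rules are applied to a fixpoint):

```python
def evaluate_axioms(state, rules):
    """Derive facts by negation as failure: fire rules until a fixpoint.

    state: the set of primary literals that hold; rules: (body, head) pairs
    whose bodies mix primary literals and derived variable names."""
    derived = set()
    changed = True
    while changed:
        changed = False
        for body, head in rules:
            if head not in derived and all(
                    lit in state or lit in derived for lit in body):
                derived.add(head)
                changed = True
    return derived
```

On the grounded problem above: while P(bottom) is false, all three Q facts are derived; once the action makes P(bottom) true, nothing is derivable, so "not Q(top)" holds by failure and the goal is reached.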
Solving the problem requires relying on negation by failure, i.e.,
finding a state in which "not Q(top)" is *not* derived. The way that
the above derivation rules are phrased is "indirect" in the sense that
they don't directly tell us how to derive the literal "not Q(top)".
They only tell us how to derive the complementary literal "Q(top)".
For mostly historical reasons (this is not used by most heuristics),
the translator attempts to rephrase this in a more "active" way, by
giving derivation rules for "not Q(top)" instead of "Q(top)". This is
described briefly in the Fast Downward paper (but not the translator
paper) in Section 5.1.
The basic idea of this rephrasing is as follows: we can try to think
of derivation rules as defining a condition in DNF. For example, let's
say we have:
A and B => X
A and C => X
B and D => X
We can view this as X being defined by
(A and B) or (A and C) or (B and D).
The condition for "not X", then, is the negation of this formula:
not ((A and B) or (A and C) or (B and D))
which we can then convert to DNF again if we want.
The problem with this is that this simple "negate the DNF" view of
derivation rules is problematic -- or at least we have to be much more
careful about it -- if there are cycles in the derivation rules. In
the task that is causing the problem, we can write the "DNF
definitions" as:
Q(left) if not P(bottom) or Q(top)
Q(right) if not P(bottom) or Q(top)
Q(top) if Q(left) or Q(right)
From this the translator generates the negated rules:
not Q(left) if P(bottom) and not Q(top)
not Q(right) if P(bottom) and not Q(top)
not Q(top) if not Q(left) and not Q(right)
and detects that it only ever needs the negated versions of Q(...) to
satisfy preconditions, goals or conditions in derivation rules. It
then decides that it should use *these* rules as the actual derivation
rules and use the *positive* literals as the negation-by-failure
fallback values. That is, to evaluate derived variables in a state, it
first assumes that all Q(...) are *true* and uses the rules above to
try to derive *false* literals. Due to the cyclic dependency (deriving
"not Q(top)" requires "not Q(left)", which in turn requires "not
Q(top)"), this won't work.
I see two main possibilities for how to proceed here:
1. Think about what the proper semantics for these negated axioms
should be in case of cycles and implement this.
2. Rip the negated axiom things out completely, as it is hardly used.
(I think the causal graph heuristic benefits from it, but this
might be about it.)
|
msg3316 (view) |
Author: patrik |
Date: 2014-08-16.03:28:28 |
|
FD reports attached problem unsolvable, but it has a valid plan. It seems that
the cause of this is an incorrect computation of the derived predicates. In the
second state (after applying the single action in the plan), all the derived
predicates are true, which they should not be. (Also attaching fd-val, which I
used to check this; unfortunately, it doesn't print anything about axiom
evaluation.)
I have not checked if the axioms in the translated SAS file are correct.
|
|
Date |
User |
Action |
Args |
2020-07-22 13:40:14 | salome | set | status: chatting -> resolved messages:
+ msg9667 |
2020-07-20 12:50:55 | malte | set | messages:
+ msg9658 |
2020-07-17 17:53:54 | salome | set | messages:
+ msg9650 |
2020-07-17 17:20:01 | malte | set | messages:
+ msg9647 |
2020-07-17 11:04:03 | malte | set | messages:
+ msg9645 |
2020-07-16 22:34:23 | salome | set | messages:
+ msg9644 |
2020-07-16 21:40:56 | malte | set | messages:
+ msg9643 |
2020-07-16 19:09:57 | salome | set | messages:
+ msg9642 |
2020-07-16 18:50:40 | malte | set | messages:
+ msg9641 |
2020-07-16 12:40:10 | salome | set | messages:
+ msg9636 |
2020-07-16 11:08:47 | salome | set | messages:
+ msg9634 |
2020-07-15 20:23:58 | malte | set | messages:
+ msg9631 summary: See meta-issue: issue924.
See msg8204 of issue862 for an overview of how the axiom code currently works.
(May become out of date while we're working on the axiom code further.)
Check for references to "issue453" in the code, specifically axiom_rules.py.
TODOs:
1. Have a translator option that implements the new strategy with the trivial
overapproximation. Make sure that relaxed reachability and relevance analysis
play nicely with this. Make sure that the axiom evaluator in the search code
ignores the fake rules. (In general, to make this work also with the old
behaviour, we can ignore all rules where the head is the default value of the
variable.)
2. Later, we probably want to try more sophisticated things, e.g. don't
overapproximate if there are no cycles. -> See meta-issue: issue924.
See msg8204 of issue862 for an overview of how the axiom code currently works.
(May become out of date while we're working on the axiom code further.)
~~Check for references to "issue453" in the code, specifically axiom_rules.py.~~
TODOs:
~~1. Have a translator option that implements the new strategy with the trivial overapproximation. Make sure that relaxed reachability and relevance analysis play nicely with this. Make sure that the axiom evaluator in the search code ignores the fake rules. (In general, to make this work also with the old behaviour, we can ignore all rules where the head is the default value of the variable.)~~
~~2. Later, we probably want to try more sophisticated things, e.g. don't overapproximate if there are no cycles.~~ |
2020-07-15 15:04:18 | malte | set | messages: + msg9620 |
2020-07-15 13:36:32 | malte | set | messages: + msg9618 |
2020-07-15 13:19:03 | malte | set | messages: + msg9617 |
2020-07-15 12:16:38 | augusto | set | messages: + msg9616 |
2020-07-15 10:26:43 | salome | set | messages: + msg9614 |
2020-07-13 18:36:04 | malte | set | messages: + msg9605 |
2020-07-13 13:51:45 | salome | set | messages: + msg9602 |
2020-07-12 16:26:13 | malte | set | messages: + msg9595 |
2020-07-12 12:19:15 | malte | set | messages: + msg9594 |
2020-07-10 07:48:36 | salome | set | messages: + msg9577 |
2020-07-10 01:11:31 | salome | set | messages: + msg9576 |
2020-07-10 00:52:38 | salome | set | messages: + msg9574 |
2020-07-10 00:48:48 | malte | set | messages: + msg9573 |
2020-07-09 23:40:14 | malte | set | messages: + msg9572 |
2020-07-09 21:43:11 | jendrik | set | messages: + msg9571 |
2020-07-09 16:58:53 | salome | set | messages: + msg9566 |
2020-07-07 16:40:01 | malte | set | messages: + msg9528 |
2020-07-07 16:32:44 | augusto | set | messages: + msg9527 |
2020-07-07 11:59:30 | salome | set | messages: + msg9518 |
2020-07-06 18:26:57 | augusto | set | messages: + msg9492 |
2020-07-06 18:09:53 | malte | set | messages: + msg9491 |
2020-07-06 18:06:09 | augusto | set | messages: + msg9490 |
2020-07-06 16:32:10 | malte | set | messages: + msg9485 |
2020-07-06 16:21:59 | salome | set | messages: + msg9484 |
2020-07-01 21:04:28 | malte | set | messages: + msg9394 |
2020-06-29 17:40:32 | salome | set | messages: + msg9357 |
2020-06-29 16:40:48 | salome | set | messages: + msg9356 |
2020-02-13 12:33:51 | salome | set | messages: + msg9197 |
2019-06-14 14:29:41 | malte | set | summary: See msg8204 of issue862 for an overview of how the axiom code currently works.
(May become out of date while we're working on the axiom code further.)
Check for references to "issue453" in the code, specifically axiom_rules.py.
TODOs:
1. Have a translator option that implements the new strategy with the trivial
overapproximation. Make sure that relaxed reachability and relevance analysis
play nicely with this. Make sure that the axiom evaluator in the search code
ignores the fake rules. (In general, to make this work also with the old
behaviour, we can ignore all rules where the head is the default value of the
variable.)
2. Later, we probably want to try more sophisticated things, e.g. don't
overapproximate if there are no cycles. -> See meta-issue: issue924.
See msg8204 of issue862 for an overview of how the axiom code currently works.
(May become out of date while we're working on the axiom code further.)
Check for references to "issue453" in the code, specifically axiom_rules.py.
TODOs:
1. Have a translator option that implements the new strategy with the trivial
overapproximation. Make sure that relaxed reachability and relevance analysis
play nicely with this. Make sure that the axiom evaluator in the search code
ignores the fake rules. (In general, to make this work also with the old
behaviour, we can ignore all rules where the head is the default value of the
variable.)
2. Later, we probably want to try more sophisticated things, e.g. don't
overapproximate if there are no cycles. |
2019-06-12 23:16:19 | malte | set | messages: + msg8893 |
2019-06-12 14:47:00 | augusto | set | messages: + msg8882 |
2019-06-07 12:00:57 | salome | set | messages: + msg8865 |
2019-06-06 16:05:17 | augusto | set | messages: + msg8851 |
2019-06-06 15:38:20 | malte | set | messages: + msg8849 |
2019-06-06 15:21:32 | salome | set | messages: + msg8848 |
2019-06-06 15:19:09 | malte | set | messages: + msg8847 |
2019-06-06 15:14:10 | augusto | set | messages: + msg8846 |
2019-06-06 15:00:49 | malte | set | messages: + msg8845 |
2019-06-06 14:54:30 | augusto | set | messages: + msg8844 |
2019-06-05 19:33:03 | malte | set | messages: + msg8819 |
2019-06-05 15:07:29 | augusto | set | messages: + msg8810 |
2019-06-04 11:11:10 | malte | set | nosy: + jendrik messages: + msg8795 summary: See msg8204 of issue862 for an overview of how the axiom code currently works.
(May become out of date while we're working on the axiom code further.)
Check for references to "issue453" in the code, specifically axiom_rules.py. -> See msg8204 of issue862 for an overview of how the axiom code currently works.
(May become out of date while we're working on the axiom code further.)
Check for references to "issue453" in the code, specifically axiom_rules.py.
TODOs:
1. Have a translator option that implements the new strategy with the trivial
overapproximation. Make sure that relaxed reachability and relevance analysis
play nicely with this. Make sure that the axiom evaluator in the search code
ignores the fake rules. (In general, to make this work also with the old
behaviour, we can ignore all rules where the head is the default value of the
variable.)
2. Later, we probably want to try more sophisticated things, e.g. don't
overapproximate if there are no cycles. |
2019-06-03 17:06:25 | augusto | set | nosy: + salome, augusto messages: + msg8790 |
2018-12-08 13:24:43 | malte | set | messages: + msg8206 summary: Check for references to "issue453" in the code, specifically axiom_rules.py. -> See msg8204 of issue862 for an overview of how the axiom code currently works.
(May become out of date while we're working on the axiom code further.)
Check for references to "issue453" in the code, specifically axiom_rules.py. |
2018-11-04 07:05:43 | malte | set | messages: + msg8047 |
2018-11-04 04:09:51 | malte | set | messages: + msg8040 summary: Check for references to "issue453" in the code, specifically axiom_rules.py. |
2014-10-04 19:32:46 | malte | set | keyword: + translator |
2014-08-21 03:10:27 | patrik | set | files: + disjunction2.pre |
2014-08-21 03:10:16 | patrik | set | files: + disjunction2.sas messages: + msg3332 |
2014-08-18 11:06:54 | malte | set | messages: + msg3323 |
2014-08-18 04:58:24 | patrik | set | messages: + msg3322 |
2014-08-17 11:36:20 | malte | set | messages: + msg3321 |
2014-08-17 09:12:59 | malte | set | messages: + msg3319 |
2014-08-17 03:15:11 | patrik | set | messages: + msg3318 |
2014-08-16 10:27:43 | malte | set | status: unread -> chatting nosy: + gabi messages: + msg3317 |
2014-08-16 03:29:55 | patrik | set | files: + fd-val.cc |
2014-08-16 03:29:03 | patrik | set | files: + disjunction2.soln |
2014-08-16 03:28:49 | patrik | set | files: + disjunction2.pddl |
2014-08-16 03:28:28 | patrik | create | |