Created on 2018-09-20.14:48:25 by jendrik, last changed by malte.
msg7799 (view) |
Author: malte |
Date: 2018-09-25.21:25:07 |
|
Many thanks, Jendrik!
|
msg7798 (view) |
Author: jendrik |
Date: 2018-09-25.18:25:47 |
|
I asked around and everybody was happy to see the synergy go away. So I went
ahead and merged this. I think we can announce the change together with the
changes we made during the sprint.
We finally did it, the synergy is gone :-)
|
msg7797 (view) |
Author: jendrik |
Date: 2018-09-25.15:42:19 |
|
Add TODO to summary.
|
msg7795 (view) |
Author: malte |
Date: 2018-09-25.13:01:09 |
|
I also left some minor comments on bitbucket.
From my side this looks good to merge, but I would suggest that you discuss it
with the others in the group to make sure that nobody minds seeing this go.
This is also the kind of change that would warrant an email to the fast-downward
mailing list. Actually, we have had quite a few of these in the sprint without
telling people (e.g. the evaluator/heuristic merge and what it means for the doc
pages and command line options, or the elimination of lm_cost_type). I think we
should rectify this by going over the issues merged during and after the sprint
and have a look at which things should be announced on the mailing list.
|
msg7794 (view) |
Author: malte |
Date: 2018-09-25.12:57:41 |
|
It is quite possible that issue846 won't have much of an impact on code size
either because we will still need the Exploration class (or something like it)
inside the lm_rhw factory and possibly other factories.
But I think code complexity (as opposed to size) is already reduced quite
significantly once there are fewer interactions between complicated pieces of
code. Hopefully, after issue846, the use of the Exploration class will be
localized to the landmark factories. This should make it much easier to
understand and improve the landmark heuristic itself.
And of course removing the synergy classes itself already simplifies the
landmark code by getting rid of ~250 lines of code, even if nothing else
currently depends on it.
|
msg7793 (view) |
Author: silvan |
Date: 2018-09-25.12:39:02 |
|
I had a look at the pull request and left one minor comment. The synergy doesn't
seem to interact at all with the other classes, except for two lines; so
removing it indeed doesn't reduce the other code complexity at all, as Malte
already said. So let's hope that issue846 will simplify the landmark heuristic
significantly; otherwise, removing the synergy does not really help, does it?
|
msg7791 (view) |
Author: jendrik |
Date: 2018-09-25.11:41:48 |
|
I agree. Maybe we can just parse which planner component was run before the
segfault, i.e., the line "Driver aborting after (component)".
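A minimal sketch of such parsing might look as follows. This is illustrative only; the helper name is hypothetical, and the exact wording of the driver's log line is assumed from the quote above:

```python
import re

def last_component_before_abort(log_text):
    """Return the planner component named in the last
    "Driver aborting after ..." line of the given log text,
    or None if no such line is present.

    Hypothetical helper; the exact log line format is an assumption.
    """
    matches = re.findall(r"Driver aborting after (\w+)", log_text)
    return matches[-1] if matches else None
```

The driver could then map the returned component (e.g. "translate" vs. "search") to different exit codes.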
|
msg7789 (view) |
Author: silvan |
Date: 2018-09-25.11:13:24 |
|
Regarding the driver: I think having the real signal values passed to the user
also has some merits, and converting only one or a few signals into (positive
integer) exit codes would be a bit strange. This is particularly true since
these "abnormal" memory errors of Python are not
reliably reproducible. What we really would like is to turn these kinds of
segfaults into a memory exit code and ignore output on stderr, but this is
probably impossible to achieve.
|
msg7786 (view) |
Author: jendrik |
Date: 2018-09-24.23:14:15 |
|
I opened issue846 for changing the way the lmcount heuristic computes
preferred operators.
|
msg7784 (view) |
Author: jendrik |
Date: 2018-09-24.22:24:55 |
|
I tried to replicate the Python segfault locally, but failed to do so. Even on
the grid the segfault doesn't happen every time.
|
msg7783 (view) |
Author: jendrik |
Date: 2018-09-24.17:01:33 |
|
I think that it would be good if the driver returned different exit codes
depending on whether the segfault happens in the translator or the search.
|
msg7781 (view) |
Author: silvan |
Date: 2018-09-24.09:26:28 |
|
Thanks for the great explanation in msg7768 of why the synergy exists and what
it actually does. I think I never really understood it from that high-level
perspective :-)
Regarding the segfaults: I've seen the same problem with the translator before,
even with the new emergency memory. Like Malte said, Python sometimes doesn't
seem to be able to cleanly exit when running out of memory, so it may still
throw a MemoryError or (rarely, in my experience) just segfault.
|
msg7780 (view) |
Author: jendrik |
Date: 2018-09-24.00:14:53 |
|
Yes, the experiments are based on the most recent version of Fast Downward. I
just forgot to convert --heuristic to --evaluator.
I have prepared a pull request that removes the synergy:
https://bitbucket.org/jendrikseipp/downward/pull-requests/100
|
msg7775 (view) |
Author: malte |
Date: 2018-09-23.14:44:58 |
|
Many thanks for running the experiments, Jendrik!
Are they based on a recent enough default branch where lm_cost_type is no longer
necessary? I'm asking because the option strings used are partly new (not using
lm_cost_type) and partly old (using the deprecated --heuristic).
Here are my thoughts on the lama-first results, focusing on coverage, quality
and the scores for total time, memory and evaluations as proxies of the main
things we care about. Let's abbreviate the configurations as "synergy", "pref"
and "nopref".
1. In terms of coverage and quality, nopref >> synergy > pref. This suggests
that the preferred operators of the landmark heuristic are not useful enough to
use, in which case we no longer need the synergy. But synergy > pref also shows
that the synergy actually helps when we want to compute preferred operators.
2. For coverage, the trends are fairly clear also on a per-domain level, with
the nurikabe domain the major exception.
3. For quality, the nopref vs. synergy picture is very mixed, with neither
having a strong advantage over the other if we ask the question "given a random
domain, which one should I prefer?". The overall quality difference is almost
entirely due to one domain (maintenance-sat14-adl), where we would probably
find out something interesting if we dug deeper into the failure of synergy and
pref.
4. For the evaluations and memory scores, we have exactly the opposite picture:
pref > synergy > nopref. If we bear in mind that these scores always also take
coverage into account to some extent (because unsolved tasks score 0), this
means that using preferred operators (in either pref or synergy) is clearly
good for focusing the search, i.e., it leads to fewer state evaluations and
hence also less memory use. So the explanation for their worse coverage seems
to be that they are better informed, but substantially slower. This is
corroborated by the total time scores, where we again have
nopref >> synergy > pref, and the gap between nopref and the others is even
more substantial.
5. An interesting question is why we have pref > synergy in terms of
evaluations and memory even though they both use preferred operators and
synergy is the faster one. I think the answer is that the heuristics are not
actually fully equivalent. For example, there are differences in how exactly
the relaxed exploration is performed. Given that the worse implementation here
is the one we want to get rid of, I don't think we have to look into this too
deeply.
For the full lama experiment, we should look at quality. (Coverage ought to be
the same as in lama-first, and it is, more or less.) We see
nopref >> synergy > pref again, and the advantage of nopref is more pronounced,
which makes sense because being faster helps find better plans in the anytime
planner. While nopref is still not dominant over synergy on a per-domain level,
it is clearly the better choice overall.
So, based on this, we can clearly get rid of the synergy and switch to
pref=false as the recommended setting for the lama configuration. Because this
is a parameter switch, I wonder if we should introduce a new alias, e.g.
"lama", and leave the existing alias alone, because the modified configuration
would no longer reflect the IPC 2011 version of LAMA, making seq-sat-lama-2011
an inappropriate name.
As a further step, as said before, I think it would be interesting to see if we
only considered the first kind of landmarks from msg7768, but not the second one.
|
msg7774 (view) |
Author: malte |
Date: 2018-09-23.14:11:53 |
|
Regarding the segfaults, this may be a Python issue. I think it's not the first
time we see that Python doesn't manage to exit cleanly when running out of
memory. We could try to diagnose it. This link:
https://stackoverflow.com/questions/16731115/how-to-debug-a-python-segmentation-fault?rq=1
suggests using the "faulthandler" module to help diagnose this.
But we could also leave it alone for now. If we can reproduce it, if it's
Python's fault and it if happens with current Python versions, we should report
it to the Python developers.
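For reference, a minimal sketch of what using faulthandler might look like. This is a generic example of the module's documented usage, not code from our driver; where exactly to place the call would be up to us:

```python
import faulthandler
import sys

# Enable the fault handler as early as possible (e.g. at the top of
# the driver script) so that a later segfault dumps a Python-level
# traceback of all threads to stderr instead of dying silently.
faulthandler.enable(file=sys.stderr, all_threads=True)

# Alternatively, setting the environment variable PYTHONFAULTHANDLER=1
# or running "python -X faulthandler ..." enables it without touching
# the code at all.
```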
|
msg7773 (view) |
Author: malte |
Date: 2018-09-23.14:03:27 |
|
No objections to updating the memory limit, but as said in the lab repository
yesterday (or was it the day before?), I think we should explain where the
numbers come from. After a bit of digging, I assume that the 3872 MiB number is
based on /etc/slurm/slurm.conf:
NodeName=ase[01-24] CPUs=16 RealMemory=61964 Sockets=2 CoresPerSocket=8
ThreadsPerCore=1 TmpDisk=410000 State=UNKNOWN
NodeName=ase[31-54] CPUs=20 RealMemory=127080 Sockets=2 CoresPerSocket=10
ThreadsPerCore=1 TmpDisk=169204 State=UNKNOWN
For the first entry (the infai_1 partition), we can calculate:
61964 / 16 = 3872.75
For the second entry (the infai_2 partition), it looks like we could afford a
lot more memory:
127080 / 20 = 6354
Why isn't 3872 good enough then? In /etc/slurm/cgroup.conf, we have
AllowedRAMSpace=99 # in %
which means that the jobs are actually only allowed 99% of the overall memory,
so maybe that is part of the reason why 3872 MiB is too much.
However, it is also the case that the ulimit limits we set are enforced at the level
of individual tasks, whereas I assume cgroup considers something more
comprehensive based on all processes in the group, which would include at
minimum the slurm job script, the driver and the planner component added together.
And then there is added uncertainty because we don't know if the
/etc/slurm/*.conf config files we see are actually the ones in use, as we found
out recently.
So perhaps there is no better way to figure this out than experimenting with
what works and what doesn't, but I think we should at least explain on which
basis we came up with the number, even if at some point it's a ballpark estimate.
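The per-core arithmetic above can be written out explicitly; this just restates the calculation in the message, under the assumption that the quoted slurm.conf values are the ones in use:

```python
# Per-core memory in MiB, from the RealMemory values quoted above.
infai_1_per_core = 61964 / 16   # infai_1 partition: 3872.75 MiB per core
infai_2_per_core = 127080 / 20  # infai_2 partition: 6354.0 MiB per core

# With AllowedRAMSpace=99 in cgroup.conf, jobs are only allowed 99%
# of the overall memory, so the effective per-core budget on infai_1
# is slightly below the nominal 3872.75 MiB:
infai_1_allowed = infai_1_per_core * 0.99  # about 3834 MiB
```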
|
msg7772 (view) |
Author: jendrik |
Date: 2018-09-23.10:20:41 |
|
Here are the results:
https://ai.dmi.unibas.ch/_tmp_files/seipp/issue839-lama-no-syn-pref-True-vs-lama-no-syn-pref-False.html
https://ai.dmi.unibas.ch/_tmp_files/seipp/issue839-lama-first-no-syn-pref-false-vs-lama-first-no-syn-pref-true.html
The lama-first experiment has some unusual unexplained errors. The segfault happens in the translator and I don't
know what could be causing it. There are also problems with exceeding the Slurm job memory limit (even with our new
ulimit setting). I suggest lowering our ulimit value from 3872 MiB to 3800 MiB. (The experiment data is at
/infai/seipp/projects/jendrik-downward/experiments/issue839 if you'd like to have a closer look.)
|
msg7771 (view) |
Author: jendrik |
Date: 2018-09-23.00:52:48 |
|
Experiments described in msg7767 are running.
|
msg7769 (view) |
Author: malte |
Date: 2018-09-22.10:16:43 |
|
PS: Of course, the suggested changes don't all have to be done in the same issue. We
can first get rid of the landmark synergy, then think about the preferred
operators of the landmark heuristic in a separate issue.
But from what I remember, ripping out the landmark synergy will only simplify
the code slightly, while ripping out the relaxed exploration parts from the
landmark heuristic will simplify it a lot, so we should definitely also use that
opportunity if we can.
Unfortunately, I don't think we will be able to physically remove the relaxed
exploration code from the landmark code because it is also needed internally
inside the lm_rhw landmark factory. But I think having a big mess only in that
factory and not also in the lmcount heuristic would already be a huge step forward.
|
msg7768 (view) |
Author: malte |
Date: 2018-09-22.10:09:43 |
|
Background for why the pref=true option of the landmark heuristic exists:
In our code, heuristics don't know whether preferred operators are going to be
needed or not, so heuristics like h^FF and h^add compute them whether or not
they are going to be used. This is OK because the overhead is not very large,
especially for h^FF, and it is very unusual not to use preferred operators anyway.
For the landmark heuristic, computing preferred operators can be expensive, and
hence we only want to do it if they end up being used. Hence the pref=true
option to enable the computation of preferred operators. If and for what the
preferred operators are going to be *used* is a different matter controlled by
the search algorithm, not the heuristic. It is currently very easy to compute
the preferred operators and not use them, or to use them without computing them
(i.e., using the empty set of preferred operators), which is what we did in the
initial no-synergy experiment. That is why we consider this an option caveat.
The basic idea of what preferred operators for the landmark heuristic are (not
sure if it is implemented exactly like that) is as follows:
1. If there is an applicable operator that reaches a landmark, all such
operators are preferred.
2. If no such operators exist, conduct an FF-style relaxed exploration towards
the nearest landmarks (according to the landmark orderings we have) and use the
preferred operators of that.
#1 should be cheap, but #2 is expensive because it needs to conduct a
relaxed exploration that would otherwise not be necessary. It also carries a
high cost in code maintenance. The landmark heuristic needs to include a way to
compute something like the FF heuristic internally *just* for #2 of the
preferred operator computation. So it would be great to be able to get rid of
that. Rather than ripping out preferred operators entirely, I think it would
make a lot more sense conceptually to only rip out #2, but keep #1 and have it
done always, like other preferred operator computations.
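The two rules above can be sketched roughly as follows. This is an illustrative sketch of the scheme as described, not the actual Fast Downward implementation; all names are hypothetical, and the two callbacks stand in for the real machinery:

```python
def landmark_preferred_operators(applicable_ops, reaches_landmark,
                                 relaxed_exploration_preferred):
    """Illustrative sketch of the two-rule scheme described above.

    reaches_landmark(op) -> bool stands for the cheap check of rule #1;
    relaxed_exploration_preferred() stands for the expensive FF-style
    relaxed exploration towards the nearest landmarks of rule #2.
    """
    # Rule 1: if any applicable operator directly reaches a landmark,
    # all such operators are preferred.
    preferred = [op for op in applicable_ops if reaches_landmark(op)]
    if preferred:
        return preferred
    # Rule 2: otherwise fall back to an FF-style relaxed exploration
    # (the expensive part we would like to get rid of).
    return relaxed_exploration_preferred()
```

Ripping out rule #2 would amount to deleting the fallback branch and always returning the (possibly empty) rule-#1 set.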
The synergy heuristics exist to reduce the computation cost of #2. The idea is
that if you want to compute h^FF and h^lm, you need to do a relaxed exploration
anyway for h^FF, so the information you're interested in for case #2 of the h^lm
preferred operators can reuse the internal data structures that have already
been computed for h^FF. If we don't want #2 any more, then there no longer is
any synergy to be exploited between the two heuristics.
|
msg7767 (view) |
Author: malte |
Date: 2018-09-22.09:52:47 |
|
Thanks, Jendrik!
That's roughly what I expected after the previous experiment. The differences in
the previous experiment seemed too large to be explained by the different
implementations, so it seemed likely that it was due to the different settings
(the synergy using preferred operators for the landmark heuristics and the
separate configuration accidentally not doing that).
The fair comparison is between the synergy-based method and the pref=true
no-synergy method, because the synergy always has the equivalent of pref=true.
(If you don't want preferred operators out of the landmarks, there is no reason
to compute them at the same time as the FF heuristic.)
So in the initial experiment we accidentally did an apples and oranges
comparison, which led us to conclude that the synergy implementation performs
worse. That's almost exactly what happened the first time we considered this
(issue123). Going over these old comments, especially the first 13 or so, there
are lots of parallels, and I found it interesting to reread these. Lots of
points raised there about "option caveats" are still equally valid. It's great
that at least we managed to clean up the lm_cost_type mess now.
Now, the very interesting thing here is that based on the data we have now,
pref=true just seems to be a poor setting, which would mean we could get rid of
the synergy and also simplify the basic lmcount heuristic a lot, which would be
great. In the old experiment (issue123) we came to a different conclusion, but
the code and benchmark set have evolved since then. (You can see that in this
experiment, almost all the differences come from "new" domains, i.e. IPC 2011 or
later.)
BTW, the experiments don't include the IPC 2018 domains. Can they be included?
Ripping out the synergies and the pref option would be a rather significant
change, so before we do something like that, I think we should have
1) the previous experiment extended to also cover the IPC 2018 domains, and
2) an equivalent experiment using the full anytime configurations rather than
just lama-first.
If we want to save time, I'm fine with doing the experiments just with the
32-bit builds, since I assume no matter what we end up with, we will be able to
make the 64-bit performance reasonable.
I'll try to explain some of the technical background (why we actually have the
synergy heuristics and the pref option) in a separate message.
|
msg7766 (view) |
Author: jendrik |
Date: 2018-09-22.09:07:29 |
|
Here are the results:
https://ai.dmi.unibas.ch/_tmp_files/seipp/issue213-release32-vs-release64-v7-lama-first-pref-30min.html
Interestingly, using pref=false leads to a higher total coverage than pref=true.
|
msg7764 (view) |
Author: jendrik |
Date: 2018-09-22.00:53:12 |
|
Experiments are running.
|
msg7693 (view) |
Author: malte |
Date: 2018-09-21.08:11:11 |
|
Looks like we should have set "pref=true" in the experiments for "lama-separate"
in issue213. See http://www.fast-downward.org/OptionCaveats. Perhaps the best
thing is to redo the lama-related experiments from issue213 with and without
that setting.
|
msg7656 (view) |
Author: silvan |
Date: 2018-09-20.16:51:51 |
|
Yay, another attempt!
|
msg7651 (view) |
Author: malte |
Date: 2018-09-20.15:58:28 |
|
> if we can get rid of the synergy without a loss in coverage
...or quality, in the case of the full lama configuration
|
msg7644 (view) |
Author: jendrik |
Date: 2018-09-20.14:48:24 |
|
In issue123 we tried to get rid of the synergy, but at the time using the
synergy achieved better results than using the ff() and lmcount() heuristics by
themselves. Experiments for issue213 (msg7629) suggest that this has changed now
and we'd like to see if we can get rid of the synergy without a loss in coverage.
|
|
Date |
User |
Action |
Args |
2018-09-25 21:25:07 | malte | set | messages:
+ msg7799 |
2018-09-25 18:25:47 | jendrik | set | status: reviewing -> resolved messages:
+ msg7798 summary: See also issue306 if we end up getting rid of the "pref" option of lmcount.
After merging this, announce the change on the mailing list. -> |
2018-09-25 15:42:19 | jendrik | set | messages:
+ msg7797 summary: See also issue306 if we end up getting rid of the "pref" option of lmcount. -> See also issue306 if we end up getting rid of the "pref" option of lmcount.
After merging this, announce the change on the mailing list. |
2018-09-25 13:01:10 | malte | set | messages:
+ msg7795 |
2018-09-25 12:57:41 | malte | set | messages:
+ msg7794 |
2018-09-25 12:39:02 | silvan | set | messages:
+ msg7793 |
2018-09-25 11:41:48 | jendrik | set | messages:
+ msg7791 |
2018-09-25 11:13:24 | silvan | set | messages:
+ msg7789 |
2018-09-24 23:14:15 | jendrik | set | messages:
+ msg7786 |
2018-09-24 22:24:55 | jendrik | set | messages:
+ msg7784 |
2018-09-24 17:01:33 | jendrik | set | messages:
+ msg7783 |
2018-09-24 09:26:28 | silvan | set | messages:
+ msg7781 |
2018-09-24 00:14:53 | jendrik | set | status: in-progress -> reviewing messages:
+ msg7780 |
2018-09-23 14:44:58 | malte | set | messages:
+ msg7775 |
2018-09-23 14:11:53 | malte | set | messages:
+ msg7774 |
2018-09-23 14:03:27 | malte | set | messages:
+ msg7773 |
2018-09-23 10:20:42 | jendrik | set | messages:
+ msg7772 |
2018-09-23 00:52:48 | jendrik | set | status: chatting -> in-progress assignedto: jendrik messages:
+ msg7771 |
2018-09-22 10:16:43 | malte | set | messages:
+ msg7769 |
2018-09-22 10:09:43 | malte | set | messages:
+ msg7768 |
2018-09-22 09:52:47 | malte | set | messages:
+ msg7767 summary: See also issue306 if we end up getting rid of the "pref" option of lmcount. |
2018-09-22 09:07:29 | jendrik | set | messages:
+ msg7766 |
2018-09-22 00:53:12 | jendrik | set | messages:
+ msg7764 |
2018-09-21 08:24:01 | cedric | set | nosy:
+ cedric |
2018-09-21 08:11:11 | malte | set | messages:
+ msg7693 |
2018-09-20 16:51:52 | silvan | set | nosy:
+ silvan messages:
+ msg7656 |
2018-09-20 15:58:28 | malte | set | status: unread -> chatting messages:
+ msg7651 |
2018-09-20 14:48:25 | jendrik | create | |
|