Created on 2018-09-20.14:48:25 by jendrik, last changed by malte.
msg7799 (view) |
Author: malte |
Date: 2018-09-25.21:25:07 |
|
Many thanks, Jendrik!
|
msg7798 (view) |
Author: jendrik |
Date: 2018-09-25.18:25:47 |
|
I asked around and everybody was happy to see the synergy go away. So I went
ahead and merged this. I think we can announce the change together with the
changes we made during the sprint.
We finally did it, the synergy is gone :-)
|
msg7797 (view) |
Author: jendrik |
Date: 2018-09-25.15:42:19 |
|
Add TODO to summary.
|
msg7795 (view) |
Author: malte |
Date: 2018-09-25.13:01:09 |
|
I also left some minor comments on bitbucket.
From my side this looks good to merge, but I would suggest that you discuss it
with the others in the group to make sure that nobody minds seeing this go.
This is also the kind of change that would warrant an email to the fast-downward
mailing list. Actually, we have had quite a few of these in the sprint without
telling people (e.g. the evaluator/heuristic merge and what it means for the doc
pages and command line options, or the elimination of lm_cost_type). I think we
should rectify this by going over the issues merged during and after the sprint
and have a look at which things should be announced on the mailing list.
|
msg7794 (view) |
Author: malte |
Date: 2018-09-25.12:57:41 |
|
It is quite possible that issue846 won't have much of an impact on code size
either because we will still need the Exploration class (or something like it)
inside the lm_rhw factory and possibly other factories.
But I think code complexity (as opposed to size) is already reduced quite
significantly once there are fewer interactions between complicated pieces of
code. Hopefully, after issue846, the use of the Exploration class will be
localized to the landmark factories. This should make it much easier to
understand and improve the landmark heuristic itself.
And of course removing the synergy classes itself already simplifies the
landmark code by getting rid of ~250 lines of code, even if nothing else
currently depends on it.
|
msg7793 (view) |
Author: silvan |
Date: 2018-09-25.12:39:02 |
|
I had a look at the pull request and left one minor comment. The synergy doesn't
seem to interact at all with the other classes, except for two lines; so
removing it indeed doesn't reduce the other code complexity at all, as Malte
already said. So let's hope that issue846 will simplify the landmark heuristic
significantly; otherwise, removing the synergy does not really help, does it?
|
msg7791 (view) |
Author: jendrik |
Date: 2018-09-25.11:41:48 |
|
I agree. Maybe we can just parse which planner component was run before the
segfault, i.e., the line "Driver aborting after (component)".
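A minimal sketch of such parsing might look as follows. This is illustrative only; the helper name is hypothetical, and the exact wording of the driver's log line is assumed from the quote above:

```python
import re

def last_component_before_abort(log_text):
    """Return the planner component named in the last
    "Driver aborting after ..." line of the given log text,
    or None if no such line is present.

    Hypothetical helper; the exact log line format is an assumption.
    """
    matches = re.findall(r"Driver aborting after (\w+)", log_text)
    return matches[-1] if matches else None
```

The driver could then map the returned component (e.g. "translate" vs. "search") to different exit codes.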
|
msg7789 (view) |
Author: silvan |
Date: 2018-09-25.11:13:24 |
|
Regarding the driver: I think having the real signal values passed to the user
also has some merits, and converting only one or a few signals into (positive
integer) exit codes would be a bit strange. This is particularly true since
these "abnormal" memory errors of Python are not
reliably reproducible. What we really would like is to turn these kinds of
segfaults into a memory exit code and ignore output on stderr, but this is
probably impossible to achieve.
|
msg7786 (view) |
Author: jendrik |
Date: 2018-09-24.23:14:15 |
|
I opened issue846 for changing the way the lmcount heuristic computes
preferred operators.
|
msg7784 (view) |
Author: jendrik |
Date: 2018-09-24.22:24:55 |
|
I tried to replicate the Python segfault locally, but failed to do so. Even on
the grid the segfault doesn't happen every time.
|
msg7783 (view) |
Author: jendrik |
Date: 2018-09-24.17:01:33 |
|
I think that it would be good if the driver returned different exit codes
depending on whether the segfault happens in the translator or the search.
|
msg7781 (view) |
Author: silvan |
Date: 2018-09-24.09:26:28 |
|
Thanks for the great explanation in msg7768 of why the synergy exists and what
it actually does. I think I never really understood it from that high-level
perspective :-)
Regarding the segfaults: I've seen the same problem with the translator before,
even with the new emergency memory. Like Malte said, Python sometimes doesn't
seem to be able to cleanly exit when running out of memory, so it may still
throw a MemoryError or (rarely, in my experience) just segfault.
|
msg7780 (view) |
Author: jendrik |
Date: 2018-09-24.00:14:53 |
|
Yes, the experiments are based on the most recent version of Fast Downward. I
just forgot to convert --heuristic to --evaluator.
I have prepared a pull request that removes the synergy:
https://bitbucket.org/jendrikseipp/downward/pull-requests/100
|
msg7775 (view) |
Author: malte |
Date: 2018-09-23.14:44:58 |
|
Many thanks for running the experiments, Jendrik!
Are they based on a recent enough default branch where lm_cost_type is no longer
necessary? I'm asking because the option strings used are partly new (not using
lm_cost_type) and partly old (using the deprecated --heuristic).
Here are my thoughts on the lama-first results, focusing on coverage, quality
and the scores for total time, memory and evaluations as proxies of the main
things we care about. Let's abbreviate the configurations as "synergy", "pref"
and "nopref".
1. In terms of coverage and quality, nopref >> synergy > pref. This suggests
that the preferred operators of the landmark heuristic are not useful enough to
use, in which case we no longer need the synergy. But synergy > pref also shows
that the synergy actually helps when we want to compute preferred operators.
2. For coverage, the trends are fairly clear also on a per-domain level, with
the nurikabe domain the major exception.
3. For quality, the nopref vs. synergy picture is very mixed, with neither
having a strong advantage over the other if we ask the question "given a random
domain, which one should I prefer?". The overall quality difference is almost
entirely due to one domain (maintenance-sat14-adl), where we would probably
find out something interesting if we dug deeper into the failure of synergy and
pref.
4. For the evaluations and memory scores, we have exactly the opposite picture:
pref > synergy > nopref. If we bear in mind that these scores always also take
coverage into account to some extent (because unsolved tasks score 0), this
means that using preferred operators (in either pref or synergy) is clearly
good for focusing the search, i.e., it leads to fewer state evaluations and
hence also less memory use. So the explanation for their worse coverage seems
to be that they are better informed, but substantially slower. This is
corroborated by the total time scores, where we again have
nopref >> synergy > pref, and the gap between nopref and the others is even
more substantial.
5. An interesting question is why we have pref > synergy in terms of
evaluations and memory even though they both use preferred operators and
synergy is the faster one. I think the answer is that the heuristics are not
actually fully equivalent. For example, there are differences in how exactly
the relaxed exploration is performed. Given that the worse implementation here
is the one we want to get rid of, I don't think we have to look into this too
deeply.
For the full lama experiment, we should look at quality. (Coverage ought to be
the same as in lama-first, and it is, more or less.) We see
nopref >> synergy > pref again, and the advantage of nopref is more pronounced,
which makes sense because being faster helps find better plans in the anytime
planner. While nopref is still not dominant over synergy on a per-domain level,
it is clearly the better choice overall.
So, based on this, we can clearly get rid of the synergy and switch to
pref=false as the recommended setting for the lama configuration. Because this
is a parameter switch, I wonder if we should introduce a new alias, e.g.
"lama", and leave the existing alias alone, because the modified configuration
would no longer reflect the IPC 2011 version of LAMA, making seq-sat-lama-2011
an inappropriate name.
As a further step, as said before, I think it would be interesting to see if we
only considered the first kind of landmarks from msg7768, but not the second one.
|
msg7774 (view) |
Author: malte |
Date: 2018-09-23.14:11:53 |
|
Regarding the segfaults, this may be a Python issue. I think it's not the first
time we see that Python doesn't manage to exit cleanly when running out of
memory. We could try to diagnose it. This link:
https://stackoverflow.com/questions/16731115/how-to-debug-a-python-segmentation-fault?rq=1
suggests using the "faulthandler" module to help diagnose this.
But we could also leave it alone for now. If we can reproduce it, if it's
Python's fault and it if happens with current Python versions, we should report
it to the Python developers.
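For reference, a minimal sketch of what using faulthandler might look like. This is a generic example of the module's documented usage, not code from our driver; where exactly to place the call would be up to us:

```python
import faulthandler
import sys

# Enable the fault handler as early as possible (e.g. at the top of
# the driver script) so that a later segfault dumps a Python-level
# traceback of all threads to stderr instead of dying silently.
faulthandler.enable(file=sys.stderr, all_threads=True)

# Alternatively, setting the environment variable PYTHONFAULTHANDLER=1
# or running "python -X faulthandler ..." enables it without touching
# the code at all.
```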
|
msg7773 (view) |
Author: malte |
Date: 2018-09-23.14:03:27 |
|
No objections to updating the memory limit, but as said in the lab repository
yesterday (or was it the day before?), I think we should explain where the
numbers come from. After a bit of digging, I assume that the 3872 MiB number is
based on /etc/slurm/slurm.conf:
NodeName=ase[01-24] CPUs=16 RealMemory=61964 Sockets=2 CoresPerSocket=8
ThreadsPerCore=1 TmpDisk=410000 State=UNKNOWN
NodeName=ase[31-54] CPUs=20 RealMemory=127080 Sockets=2 CoresPerSocket=10
ThreadsPerCore=1 TmpDisk=169204 State=UNKNOWN
For the first entry (the infai_1 partition), we can calculate:
61964 / 16 = 3872.75
For the second entry (the infai_2 partition), it looks like we could afford a
lot more memory:
127080 / 20 = 6354
Why isn't 3872 good enough then? In /etc/slurm/cgroup.conf, we have
AllowedRAMSpace=99 # in %
which means that the jobs are actually only allowed 99% of the overall memory,
so maybe that is part of the reason why 3872 MiB is too much.
However, it is also the case that the ulimit limits we set are enforced at the level
of individual tasks, whereas I assume cgroup considers something more
comprehensive based on all processes in the group, which would include at
minimum the slurm job script, the driver and the planner component added together.
And then there is added uncertainty because we don't know if the
/etc/slurm/*.conf config files we see are actually the ones in use, as we found
out recently.
So perhaps there is no better way to figure this out than experimenting with
what works and what doesn't, but I think we should at least explain on which
basis we came up with the number, even if at some point it's a ballpark estimate.
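The per-core arithmetic above can be written out explicitly; this just restates the calculation in the message, under the assumption that the quoted slurm.conf values are the ones in use:

```python
# Per-core memory in MiB, from the RealMemory values quoted above.
infai_1_per_core = 61964 / 16   # infai_1 partition: 3872.75 MiB per core
infai_2_per_core = 127080 / 20  # infai_2 partition: 6354.0 MiB per core

# With AllowedRAMSpace=99 in cgroup.conf, jobs are only allowed 99%
# of the overall memory, so the effective per-core budget on infai_1
# is slightly below the nominal 3872.75 MiB:
infai_1_allowed = infai_1_per_core * 0.99  # about 3834 MiB
```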
|
msg7772 (view) |
Author: jendrik |
Date: 2018-09-23.10:20:41 |
|
Here are the results:
https://ai.dmi.unibas.ch/_tmp_files/seipp/issue839-lama-no-syn-pref-True-vs-lama-no-syn-pref-False.html
https://ai.dmi.unibas.ch/_tmp_files/seipp/issue839-lama-first-no-syn-pref-false-vs-lama-first-no-syn-pref-true.html
The lama-first experiment has some unusual unexplained errors. The segfault happens in the translator and I don't
know what could be causing it. There are also problems with exceeding the Slurm job memory limit (even with our new
ulimit setting). I suggest lowering our ulimit value from 3872 MiB to 3800 MiB. (The experiment data is at
/infai/seipp/projects/jendrik-downward/experiments/issue839 if you'd like to have a closer look.)
|
msg7771 (view) |
Author: jendrik |
Date: 2018-09-23.00:52:48 |
|
Experiments described in msg7767 are running.
|
msg7769 (view) |
Author: malte |
Date: 2018-09-22.10:16:43 |
|
PS: Of course, the suggested changes don't all have to be done in the same issue. We
can first get rid of the landmark synergy, then think about the preferred
operators of the landmark heuristic in a separate issue.
But from what I remember, ripping out the landmark synergy will only simplify
the code slightly, while ripping out the relaxed exploration parts from the
landmark heuristic will simplify it a lot, so we should definitely also use that
opportunity if we can.
Unfortunately, I don't think we will be able to physically remove the relaxed
exploration code from the landmark code because it is also needed internally
inside the lm_rhw landmark factory. But I think having a big mess only in that
factory and not also in the lmcount heuristic would already be a huge step forward.
|
msg7768 (view) |
Author: malte |
Date: 2018-09-22.10:09:43 |
|
Background for why the pref=true option of the landmark heuristic exists:
In our code, heuristics don't know whether preferred operators are going to be
needed or not, so heuristics like h^FF and h^add compute them whether or not
they are going to be used. This is OK because the overhead is not very large,
especially for h^FF, and it is very unusual not to use preferred operators anyway.
For the landmark heuristic, computing preferred operators can be expensive, and
hence we only want to do it if they end up being used. Hence the pref=true
option to enable the computation of preferred operators. If and for what the
preferred operators are going to be *used* is a different matter controlled by
the search algorithm, not the heuristic. It is currently very easy to compute
the preferred operators and not use them, or to use them without computing them
(i.e., using the empty set of preferred operators), which is what we did in the
initial no-synergy experiment. That is why we consider this an option caveat.
The basic idea of what preferred operators for the landmark heuristic are (not
sure if it is implemented exactly like that) is as follows:
1. If there is an applicable operator that reaches a landmark, all such
operators are preferred.
2. If no such operators exist, conduct an FF-style relaxed exploration towards
the nearest landmarks (according to the landmark orderings we have) and use the
preferred operators of that.
#1 should be cheap, but #2 is expensive because it needs to conduct a
relaxed exploration that would otherwise not be necessary. It also carries a
high cost in code maintenance. The landmark heuristic needs to include a way to
compute something like the FF heuristic internally *just* for #2 of the
preferred operator computation. So it would be great to be able to get rid of
that. Rather than ripping out preferred operators entirely, I think it would
make a lot more sense conceptually to only rip out #2, but keep #1 and have it
done always, like other preferred operator computations.
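The two rules above can be sketched roughly as follows. This is an illustrative sketch of the scheme as described, not the actual Fast Downward implementation; all names are hypothetical, and the two callbacks stand in for the real machinery:

```python
def landmark_preferred_operators(applicable_ops, reaches_landmark,
                                 relaxed_exploration_preferred):
    """Illustrative sketch of the two-rule scheme described above.

    reaches_landmark(op) -> bool stands for the cheap check of rule #1;
    relaxed_exploration_preferred() stands for the expensive FF-style
    relaxed exploration towards the nearest landmarks of rule #2.
    """
    # Rule 1: if any applicable operator directly reaches a landmark,
    # all such operators are preferred.
    preferred = [op for op in applicable_ops if reaches_landmark(op)]
    if preferred:
        return preferred
    # Rule 2: otherwise fall back to an FF-style relaxed exploration
    # (the expensive part we would like to get rid of).
    return relaxed_exploration_preferred()
```

Ripping out rule #2 would amount to deleting the fallback branch and always returning the (possibly empty) rule-#1 set.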
The synergy heuristics exist to reduce the computation cost of #2. The idea is
that if you want to compute h^FF and h^lm, you need to do a relaxed exploration
anyway for h^FF, so the information you're interested in for case #2 of the h^lm
preferred operators can reuse the internal data structures that have already
been computed for h^FF. If we don't want #2 any more, then there no longer is
any synergy to be exploited between the two heuristics.
|
msg7767 (view) |
Author: malte |
Date: 2018-09-22.09:52:47 |
|
Thanks, Jendrik!
That's roughly what I expected after the previous experiment. The differences in
the previous experiment seemed too large to be explained by the different
implementations, so it seemed likely that it was due to the different settings
(the synergy using preferred operators for the landmark heuristics and the
separate configuration accidentally not doing that).
The fair comparison is between the synergy-based method and the pref=true
no-synergy method, because the synergy always has the equivalent of pref=true.
(If you don't want preferred operators out of the landmarks, there is no reason
to compute them at the same time as the FF heuristic.)
So in the initial experiment we accidentally did an apples and oranges
comparison, which led us to conclude that the synergy implementation performs
worse. That's almost exactly what happened the first time we considered this
(issue123). Going over these old comments, especially the first 13 or so, there
are lots of parallels, and I found it interesting to reread these. Lots of
points raised there about "option caveats" are still equally valid. It's great
that at least we managed to clean up the lm_cost_type mess now.
Now, the very interesting thing here is that based on the data we have now,
pref=true just seems to be a poor setting, which would mean we could get rid of
the synergy and also simplify the basic lmcount heuristic a lot, which would be
great. In the old experiment (issue123) we came to a different conclusion, but
the code and benchmark set have evolved since then. (You can see that in this
experiment, almost all the differences come from "new" domains, i.e. IPC 2011 or
later.)
BTW, the experiments don't include the IPC 2018 domains. Can they be included?
Ripping out the synergies and the pref option would be a rather significant
change, so before we do something like that, I think we should have
1) the previous experiment extended to also cover the IPC 2018 domains, and
2) an equivalent experiment using the full anytime configurations rather than
just lama-first.
If we want to save time, I'm fine with doing the experiments just with the
32-bit builds, since I assume no matter what we end up with, we will be able to
make the 64-bit performance reasonable.
I'll try to explain some of the technical background (why we actually have the
synergy heuristics and the pref option) in a separate message.
|
msg7766 (view) |
Author: jendrik |
Date: 2018-09-22.09:07:29 |
|
Here are the results:
https://ai.dmi.unibas.ch/_tmp_files/seipp/issue213-release32-vs-release64-v7-lama-first-pref-30min.html
Interestingly, using pref=false leads to a higher total coverage than pref=true.
|
msg7764 (view) |
Author: jendrik |
Date: 2018-09-22.00:53:12 |
|
Experiments are running.
|
msg7693 (view) |
Author: malte |
Date: 2018-09-21.08:11:11 |
|
Looks like we should have set "pref=true" in the experiments for "lama-separate"
in issue213. See http://www.fast-downward.org/OptionCaveats. Perhaps the best
thing is to redo the lama-related experiments from issue213 with and without
that setting.
|
msg7656 (view) |
Author: silvan |
Date: 2018-09-20.16:51:51 |
|
Yay, another attempt!
|
msg7651 (view) |
Author: malte |
Date: 2018-09-20.15:58:28 |
|
> if we can get rid of the synergy without a loss in coverage
...or quality, in the case of the full lama configuration
|
msg7644 (view) |
Author: jendrik |
Date: 2018-09-20.14:48:24 |
|
In issue123 we tried to get rid of the synergy, but at the time using the
synergy achieved better results than using the ff() and lmcount() heuristics by
themselves. Experiments for issue213 (msg7629) suggest that this has changed now
and we'd like to see if we can get rid of the synergy without a loss in coverage.
|
|
Date |
User |
Action |
Args |
2018-09-25 21:25:07 | malte | set | messages:
+ msg7799 |
2018-09-25 18:25:47 | jendrik | set | status: reviewing -> resolved messages:
+ msg7798 summary: See also issue306 if we end up getting rid of the "pref" option of lmcount.
After merging this, announce the change on the mailing list. -> |
2018-09-25 15:42:19 | jendrik | set | messages:
+ msg7797 summary: See also issue306 if we end up getting rid of the "pref" option of lmcount. -> See also issue306 if we end up getting rid of the "pref" option of lmcount.
After merging this, announce the change on the mailing list. |
2018-09-25 13:01:10 | malte | set | messages:
+ msg7795 |
2018-09-25 12:57:41 | malte | set | messages:
+ msg7794 |
2018-09-25 12:39:02 | silvan | set | messages:
+ msg7793 |
2018-09-25 11:41:48 | jendrik | set | messages:
+ msg7791 |
2018-09-25 11:13:24 | silvan | set | messages:
+ msg7789 |
2018-09-24 23:14:15 | jendrik | set | messages:
+ msg7786 |
2018-09-24 22:24:55 | jendrik | set | messages:
+ msg7784 |
2018-09-24 17:01:33 | jendrik | set | messages:
+ msg7783 |
2018-09-24 09:26:28 | silvan | set | messages:
+ msg7781 |
2018-09-24 00:14:53 | jendrik | set | status: in-progress -> reviewing messages:
+ msg7780 |
2018-09-23 14:44:58 | malte | set | messages:
+ msg7775 |
2018-09-23 14:11:53 | malte | set | messages:
+ msg7774 |
2018-09-23 14:03:27 | malte | set | messages:
+ msg7773 |
2018-09-23 10:20:42 | jendrik | set | messages:
+ msg7772 |
2018-09-23 00:52:48 | jendrik | set | status: chatting -> in-progress assignedto: jendrik messages:
+ msg7771 |
2018-09-22 10:16:43 | malte | set | messages:
+ msg7769 |
2018-09-22 10:09:43 | malte | set | messages:
+ msg7768 |
2018-09-22 09:52:47 | malte | set | messages:
+ msg7767 summary: See also issue306 if we end up getting rid of the "pref" option of lmcount. |
2018-09-22 09:07:29 | jendrik | set | messages:
+ msg7766 |
2018-09-22 00:53:12 | jendrik | set | messages:
+ msg7764 |
2018-09-21 08:24:01 | cedric | set | nosy:
+ cedric |
2018-09-21 08:11:11 | malte | set | messages:
+ msg7693 |
2018-09-20 16:51:52 | silvan | set | nosy:
+ silvan messages:
+ msg7656 |
2018-09-20 15:58:28 | malte | set | status: unread -> chatting messages:
+ msg7651 |
2018-09-20 14:48:25 | jendrik | create | |
|