msg4788 (view) |
Author: malte |
Date: 2015-11-12.18:10:14 |
|
Thanks for the quick fix, Jendrik! :-) I assume this requires no changes in
common_setup.py, right? (If it doesn't, no need to reply.)
Given that this was a lab issue, we won't need the branch I started in my
bitbucket repository for this, so I will strip it to avoid accidentally pushing
it to master later. If you pulled it, you'll have to strip it, too.
|
msg4787 (view) |
Author: jendrik |
Date: 2015-11-12.15:47:38 |
|
Rerunning the test experiment showed no unexpected errors and the correct
revisions are compared. A new lab version has been released.
|
msg4786 (view) |
Author: jendrik |
Date: 2015-11-12.14:00:51 |
|
Yes, this only affects experiments comparing *multiple* revisions.
|
msg4785 (view) |
Author: silvan |
Date: 2015-11-12.13:03:55 |
|
Jendrik, does this only affect experiments where we use more than one revision?
I have already run lots of experiments on research branches, and at least after
adding a new option, for example, the correct revision must have been used;
otherwise the option would not have been accepted by Fast Downward.
|
msg4784 (view) |
Author: jendrik |
Date: 2015-11-12.12:30:45 |
|
The bug is now fixed in lab. I'll rerun the issue481 experiment and tag a new lab
bugfix release afterwards.
|
msg4783 (view) |
Author: jendrik |
Date: 2015-11-12.12:13:32 |
|
Thanks to your investigation, I was able to pinpoint the error to the new
whole-planner experiment class. There's a bug that causes all "revisions" to use
the same (random) revision. I'll report back once I've fixed the bug.
I've changed the title and added Silvan to the nosy list since he's already
using the new experiment class for comparing revisions.
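For reference, one common way a bug like this arises in Python experiment code is a callback that closes over a loop variable, so every run ends up with whichever revision was bound last. This is a hypothetical sketch, not lab's actual implementation; the function names and revision strings are made up:

```python
def make_run_commands_buggy(revisions):
    """Build one checkout command per revision (buggy version)."""
    commands = []
    for rev in revisions:
        # BUG: 'rev' is looked up when the lambda is *called*, not when it
        # is defined, so every command sees the final value of 'rev'.
        commands.append(lambda: "checkout " + rev)
    return commands

def make_run_commands_fixed(revisions):
    # FIX: bind the current value of 'rev' via a default argument.
    return [lambda rev=rev: "checkout " + rev for rev in revisions]

buggy = [command() for command in make_run_commands_buggy(["base", "v2"])]
fixed = [command() for command in make_run_commands_fixed(["base", "v2"])]
print(buggy)  # ['checkout v2', 'checkout v2'] -- all runs use one revision
print(fixed)  # ['checkout base', 'checkout v2']
```

The symptom matches what was observed: every "revision" in the experiment silently uses the same (effectively random) one.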
|
msg4782 (view) |
Author: malte |
Date: 2015-11-12.11:17:42 |
|
I've run a few more tests with issue481, and I get the following curious result:
- If I run the baseline revision and the issue branch revision in the
experiment, then I get the unexplained errors for both revisions. For the issue
branch, this is not so surprising because it currently comments out the signal
handler for debug reasons. However, for the baseline revision it is surprising.
- If I run the same experiment, but only using the baseline revision (i.e. the
only change is I remove one of the revisions), then the baseline revision
doesn't produce errors any more. That is, the baseline revision *only produces
errors if the issue branch revision is also part of the experiment*.
Of course, there may also be random failures involved, and the above observation
might be the outcome of a random process. But I suspect there is something more
going on there.
One possible explanation is that the wrong code is run or that the results are
somehow jumbled up during fetching or report generation. I don't have time to
look into this more at the moment, but I'll try to look at it again later.
|
msg4781 (view) |
Author: malte |
Date: 2015-11-12.10:34:04 |
|
OK, the issue481 experiment seems to fail reproducibly, or at least it failed
similarly on a second attempt. I made a smaller version of it that only considers
the floortile domain, and I got similar errors as before (in the issue481 branch
under v2-*.py).
I'll try to look into this a bit more over the next days if I can find the time.
|
msg4780 (view) |
Author: malte |
Date: 2015-11-12.09:41:09 |
|
No problems with the grid experiment either, so this cannot be reproduced for
now. I'll try to repeat the original experiment from issue481 to see if the
errors there are reproducible. If not, I'd close this for now, since it might
just be a sporadic grid issue (although I don't really know how that could be
the case).
|
msg4777 (view) |
Author: malte |
Date: 2015-11-12.07:05:59 |
|
Let's wait for the outcome of the experiments. But I think one difference is
that we used to set the memory limit within lab, whereas now we rely on setting
the memory limit in the driver script (with the driver option for setting memory
limits). My understanding is that the new code doesn't set a memory limit within
lab at all. Is that right? (If it does set one, with which method is it set and
how high is it?)
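To make the distinction concrete, here is a minimal sketch of applying a memory limit from the calling side, assuming a POSIX system. The helper name is made up and is not lab's or the driver's actual API; it only shows the general mechanism (capping the child's address space before exec):

```python
import resource
import subprocess
import sys

def run_with_memory_limit(cmd, limit_bytes):
    """Run cmd in a child process whose address space is capped via RLIMIT_AS."""
    def set_limit():
        # Runs in the child between fork and exec (POSIX only).
        resource.setrlimit(resource.RLIMIT_AS, (limit_bytes, limit_bytes))
    return subprocess.run(cmd, preexec_fn=set_limit)

# A child that tries to allocate ~1 GiB under a 256 MiB cap should fail
# with a nonzero exit code; without the cap it would succeed.
result = run_with_memory_limit(
    [sys.executable, "-c", "x = bytearray(1024 ** 3)"],
    256 * 1024 * 1024)
print(result.returncode != 0)
```

If lab no longer does anything like this and the limit comes only from the driver's `--search-memory-limit`, the effective limit (and how a violation manifests) depends entirely on the driver's mechanism.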
|
msg4776 (view) |
Author: jendrik |
Date: 2015-11-12.00:27:42 |
|
Hmm, I've looked at the way the memory limit is set by the old and new experiment
classes and couldn't find a meaningful difference.
|
msg4775 (view) |
Author: malte |
Date: 2015-11-11.23:58:03 |
|
I've started an experiment on maia, but the queue is very full, so I don't
expect it to run soon.
In the meantime I've also tried to reproduce this manually with the mentioned
revision by running
./fast-downward.py --search-memory-limit=2G seq-p04-007.pddl --search "eager_greedy(ff())"
manually on different machines. (This is from the floortile-sat11-strips domain;
I had copied the PDDL files to the current directory.) I've also tried with 128M
instead of 2G.
So far, none of the manual attempts could reproduce this. I tried on my home
desktop, on maia, and on ase01. In all six cases (three machines, two memory
limits), the planner shut down cleanly after hitting the memory limit. So it
looks like either we can't reproduce it at all, or we can only reproduce it when
running within a grid job.
If the latter is the case, it may be due to some interaction with the latest
version of lab, since the main thing that has changed recently in this
department is the lab upgrade and usage of whole-planner experiments.
I'll send another update when the grid experiment is done.
|
msg4774 (view) |
Author: malte |
Date: 2015-11-11.22:56:37 |
|
I'm not sure if I have time to really work on this, but I can try to reproduce
it and find out when this was introduced. I've started a pull request here for this:
https://bitbucket.org/malte/downward/pull-requests/5/issue594-dont-let-bad_alloc-escape-from/diff
|
msg4773 (view) |
Author: jendrik |
Date: 2015-11-11.22:38:03 |
|
Working on issue481 we noticed that the planner is often aborted when it runs
out of memory without our out-of-memory handler being called. This happens e.g.
in revision 6642b246b180 using the configuration ["--search",
"eager_greedy(ff())"]. The "floortile-sat11-strips" domain should provide a
good test suite since the error happens very often there. We should try to find
out where this regression happened and fix it.
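As a rough illustration of what a working out-of-memory handler looks like: Fast Downward's real handler lives in C++ (dealing with std::bad_alloc), so this Python sketch, with a made-up message and exit code, only mimics the intended behavior of reporting the failure and exiting cleanly instead of being aborted:

```python
import subprocess
import sys
import textwrap

# Child script: set a hard memory limit, then allocate past it. The
# try/except plays the role of the out-of-memory handler: when it runs,
# the process prints a message and exits with a dedicated exit code
# (both chosen arbitrarily here) instead of dying without a trace.
child_script = textwrap.dedent("""
    import resource
    resource.setrlimit(resource.RLIMIT_AS, (128 * 1024 * 1024,) * 2)
    try:
        data = bytearray(1024 ** 3)
    except MemoryError:
        print("out of memory")
        raise SystemExit(12)
""")

proc = subprocess.run([sys.executable, "-c", child_script],
                      capture_output=True, text=True)
print(proc.stdout.strip())   # the handler's message
print(proc.returncode)       # the handler's exit code
```

The bug described above is the opposite behavior: the process hits the limit and is aborted before any such handler gets a chance to run, so the experiment sees an unexplained error instead of a clean out-of-memory exit.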
|
|
Date | User | Action | Args
2015-11-12 18:10:14 | malte | set | messages: + msg4788
2015-11-12 15:47:38 | jendrik | set | status: in-progress -> resolved, assignedto: jendrik, messages: + msg4787
2015-11-12 14:00:51 | jendrik | set | messages: + msg4786
2015-11-12 13:03:56 | silvan | set | messages: + msg4785
2015-11-12 12:30:45 | jendrik | set | messages: + msg4784
2015-11-12 12:13:32 | jendrik | set | status: chatting -> in-progress, nosy: + silvan, messages: + msg4783, title: fix catching out-of-memory errors -> lab: use correct revisions in FastDownwardExperiment
2015-11-12 11:17:42 | malte | set | messages: + msg4782
2015-11-12 10:34:04 | malte | set | messages: + msg4781
2015-11-12 09:41:09 | malte | set | messages: + msg4780
2015-11-12 07:05:59 | malte | set | messages: + msg4777
2015-11-12 00:27:42 | jendrik | set | messages: + msg4776
2015-11-11 23:58:03 | malte | set | messages: + msg4775
2015-11-11 22:56:37 | malte | set | status: unread -> chatting, messages: + msg4774
2015-11-11 22:38:03 | jendrik | create |