Created on 2014-09-18.17:51:08 by florian, last changed by jendrik.

File name  | Uploaded                   | Type
planner.cc | malte, 2014-09-20.02:20:32 | text/x-c++src

msg3545 | Author: jendrik | Date: 2014-09-25.15:38:17

I'd say it's easiest to wait until issue414 is merged before removing
EXIT_TIMEOUT. No issue necessary for this; I'll add it to my TODO list.

msg3527 | Author: malte | Date: 2014-09-24.13:23:18

Signals are meant to live in a different parameter range than "normal" exit
codes (which are in the range 0..127); a signal suggests that the process was
terminated for external reasons that it could do nothing about. (A process can choose to
ignore a soft time limit by blocking the signal, but that will only lead to an
unblockable kill once the hard limit is reached.)
Since the portfolio code terminates itself without involving external time
limits and similar shenanigans, I think raising a signal isn't appropriate in
that case. So I'm happy with the status quo regarding the signals and exit codes
that are used.
Removing EXIT_TIMEOUT in the planner may be a good idea. People might expect it
to be used when specifying a search timeout with our new options, but it isn't.
Once the new driver script is merged, we should then adapt
http://www.fast-downward.org/PlannerUsage?highlight=%28timeout%29
since this currently documents the exit codes as part of the search executable,
but the portfolio-only ones would be part of the driver script instead.

msg3523 | Author: jendrik | Date: 2014-09-24.11:13:03

Maybe we should remove EXIT_TIMEOUT from the planner code then?
Currently, the portfolio code exits with EXIT_TIMEOUT if all configurations exited with
SIGXCPU. Should it exit with SIGXCPU instead, too?
It seems odd to have two different exit codes for the same thing, but I understand
that one is calculated and one is a signal, and probably we shouldn't mix the two.
If we don't act on the points above, I think both the portfolio code and lab already
handle SIGXCPU exit codes correctly.

msg3519 | Author: malte | Date: 2014-09-24.00:56:09

If a process receives a signal, then the default behaviour is to terminate with
that signal, so there is a certain consistency in that. If we don't have a
particular reason to change this, I think it makes sense to keep the default
behaviour. (Also, it's hard or impossible to guarantee that our signal handler
will always work as desired under all circumstances, e.g. on extremely loaded
machines, and in cases where it doesn't work as intended, we'll also receive a
SIGXCPU result that we have no control over.)

msg3517 | Author: jendrik | Date: 2014-09-23.22:09:08

In case we hit the soft time limit, why do we want the planner to "exit with"
signal SIGXCPU instead of with EXIT_TIMEOUT?

msg3515 | Author: malte | Date: 2014-09-23.19:34:28

> I don't know about any other ways we use to determine a timeout. If there are
> any in lab, I suggest we open an issue in the lab tracker. Jendrik, what do you
> think?
utilities.{h,cc} and the portfolio code define an exit code called
"EXIT_TIMEOUT", but it seems that it's currently impossible for the planner to
generate this.

msg3513 | Author: florian | Date: 2014-09-23.19:12:58

> we should now *always* receive a raw_memory value -- if we don't,
> this should count as an unexplained error.
I used a custom parser in this experiment, so a regular lab experiment will not
contain this value. I'll send a pull request to Jendrik to include it in lab.
> This would mean that there would be a new signal with
> which the code could terminate (SIGXCPU), so this should be interpreted as a
> regular error signifying "timeout" in the portfolio error code analysis and lab
> error code analysis.
As far as I know, lab already tests for the exit code 128 + 24 (signal SIGXCPU)
and considers it a timeout. Our signal handler just prints a message and then
raises the original signal again, so the exit code is not changed by the soft limit.
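
For illustration, here is a minimal sketch of the print-and-re-raise pattern
described above (not the actual utilities.cc code). When the process dies from
the re-raised signal, the parent sees termination by SIGXCPU, which shells and
lab report as exit code 128 + 24:

    #include <csignal>
    #include <unistd.h>

    void example_signal_handler(int signal_number) {
        // Keep the handler allocation-free: write() a fixed message to stderr.
        const char message[] = "caught signal -- exiting\n";
        write(STDERR_FILENO, message, sizeof(message) - 1);
        // Restore the default disposition and re-raise, so the process still
        // terminates "with" the original signal rather than a normal exit code.
        std::signal(signal_number, SIG_DFL);
        std::raise(signal_number);
    }

    int main() {
        std::signal(SIGXCPU, example_signal_handler);
        // ... run the search; a soft CPU limit delivers SIGXCPU here ...
        return 0;
    }
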
> In fact, once this works, *other* error codes should
> perhaps no longer be considered valid indications of timeout. (Don't know which
> criteria we use for this at the moment.) How do we proceed with this? Open a new
> issue, or is it simple enough that we just add the additional line?
I don't know about any other ways we use to determine a timeout. If there are
any in lab, I suggest we open an issue in the lab tracker. Jendrik, what do you
think?

msg3512 | Author: malte | Date: 2014-09-23.18:43:42

Given that this issue was already landed (I wrote my comment simultaneously with
yours), I have now gone ahead and added the additional line to the codebase. I
don't think we necessarily need to add an experiment for this, but of course it
wouldn't hurt, and lab + portfolio should be adapted for the new behaviour. This
means that we should now *always* receive a raw_memory value -- if we don't,
this should count as an unexplained error.

msg3511 | Author: malte | Date: 2014-09-23.18:41:58

Excellent!
> Reporting raw memory usage does not work if the task is killed by the grid
> engine for exhausting its time limit.
I made a little test: the planner would terminate gracefully at "soft" time
limits if we applied the following patch:
$ hg diff
diff -r 1b71442ead73 src/search/utilities.cc
--- a/src/search/utilities.cc Tue Sep 23 12:04:24 2014 +0200
+++ b/src/search/utilities.cc Tue Sep 23 18:39:13 2014 +0200
@@ -42,6 +42,7 @@
     signal(SIGTERM, signal_handler);
     signal(SIGSEGV, signal_handler);
     signal(SIGINT, signal_handler);
+    signal(SIGXCPU, signal_handler);
 }
 #if OPERATING_SYSTEM == LINUX || OPERATING_SYSTEM == OSX
IIRC, lab already uses a soft + hard limit, so I don't think we would need to
change anything in lab. This would mean that there would be a new signal with
which the code could terminate (SIGXCPU), so this should be interpreted as a
regular error signifying "timeout" in the portfolio error code analysis and lab
error code analysis. In fact, once this works, *other* error codes should
perhaps no longer be considered valid indications of timeout. (Don't know which
criteria we use for this at the moment.) How do we proceed with this? Open a new
issue, or is it simple enough that we just add the additional line?
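
For reference, the soft + hard limit scheme mentioned above can be set up as
in the following hypothetical sketch (values made up; this is not code from
lab or the driver): the kernel delivers SIGXCPU at the soft CPU limit and an
unblockable SIGKILL at the hard limit.

    #include <sys/resource.h>
    #include <cstdio>
    #include <cstdlib>

    int main() {
        rlimit cpu_limit;
        cpu_limit.rlim_cur = 1800;       // soft limit: SIGXCPU after 30 CPU minutes
        cpu_limit.rlim_max = 1800 + 60;  // hard limit: SIGKILL one minute later
        if (setrlimit(RLIMIT_CPU, &cpu_limit) != 0) {
            perror("setrlimit");
            return EXIT_FAILURE;
        }
        // With the patch above, the planner handles SIGXCPU, prints its peak
        // memory, and then terminates with the signal (exit status 128 + 24).
        // ... run or exec the search here; the limits are inherited ...
        return EXIT_SUCCESS;
    }
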

msg3510 | Author: florian | Date: 2014-09-23.18:38:41

Merged.

msg3508 | Author: florian | Date: 2014-09-23.17:06:03

Results look good for blind search on all tasks from satisficing and optimal
suites. There are no more unexplained errors in the new version and memory is
reported correctly even if the planner runs out of memory.
http://ai.cs.unibas.ch/_tmp_files/pommeren/issue469-issue469.html
Reporting raw memory usage does not work if the task is killed by the grid
engine for exhausting its time limit.

msg3501 | Author: florian | Date: 2014-09-22.22:38:13

For the record: we looked at the results for this experiment offline and it
showed errors for low reserves (4, 8, 16 KiB) and high reserves (128, 256, 512 KiB).
Low reserves likely cause an error because the free memory is not sufficient to
allocate the file stream's buffer. We only have half of the reserve in this
experiment and need a little over 8 KiB (msg3476).
High reserves can lead to the failed assertion that started this issue (see
point 3. in msg3473).
We are now running experiments on a version that uses no buffer for the stream
and a 32 KiB memory buffer.

msg3493 | Author: malte | Date: 2014-09-21.18:32:14

No matter what I do, I have trouble reproducing the issue with the current code.
You said that you had to play around with the revision cache in the previous
experiment. Is it possible that the compile that was ultimately used happened in
a slightly different environment than most experiments, maybe compiled on a
different machine?
The ingredients for the failure seem to be there; it just doesn't seem to arise
any more. I've pushed some changes that might make it more likely for the
problem to arise. Can you run some additional experiments with the new code?
There is a new option "--additional-memory-pressure". To minimize the number of
configurations, I recommend always setting it to min(32768, memory_reserve / 2)
in this experiment.
Let's only look at the 2 GiB memory limit, but keep the variation of the reserve
memory.

msg3492 | Author: malte | Date: 2014-09-21.17:49:29

Well, this is tricky stuff. Static vs. dynamic compilation doesn't make a
difference. What *does* seem to make a difference is *when* the memory reserve
is allocated.
With our new code, the memory reserve is no longer allocated with mmap, even if
we set it to the rather large 512 KiB. It seems that the reason is because we
allocate it later during execution. Before allocating the memory reserve, we
read the input, for which a buffer of 1 MiB is allocated but not released. It
seems that this buffer is then reused by general memory management, and hence
our reserve buffer is allocated internally rather than via mmap.
So to reproduce the problematic behaviour, we should allocate the reserve memory
before reading the input. By making this change, I have been able to see the
mmap/munmap behaviour again (which means that our released memory buffer doesn't
help), but even then I've had a hard time reproducing the problem. I think the
reason for this is that even without usable reserve memory, we can get lucky and
have 8K available memory if the last failed memory request required more memory
than this.

msg3489 | Author: malte | Date: 2014-09-21.15:13:02

> One possible explanation is that starting the task without an external address
> space limit causes the mmap area to be allocated further out than it is with
> the 2 GiB external limit, and hence we don't hit it any more. This should be
> quite easy to test with strace and some manual runs. I'll look into it once I
> know where to find the experiment.
No, that's not the case. Both with an internally and an externally set limit, at
some point we hit the mmapped area and brk() starts failing.
There's an important difference in the strace compared to the previous
experiment: once we run out of memory and release the memory buffer, there is no
more memory management activity with the OS at all until termination: no munmap
to release the memory buffer to the OS and no attempt to obtain more memory
again with brk or mmap2.
I think this means that we keep the memory from the buffer around internally
now, and hence when we need memory later for determining peak memory, it is
available in our internal memory pool.
Regarding why the behaviour differs, I think the reason is the way
we compile now: previously we used a dynamically linked debug compile. Now we
use a statically linked release compile. Presumably the memory management is
different in the two cases, perhaps because the machine on which we compile has
slightly different library versions than the ones on which we run. I'll try to
test this hypothesis.

msg3488 | Author: malte | Date: 2014-09-21.14:19:30

> Where can I find the logs on maia?
Never mind, found it. (Of course it's lying next to the other experiment.)
There seems to be not a single case where we get the warning that peak memory
could not be determined, which supports the theory that the memory environment
is different here in some way. (We had 194 such warnings in the previous
experiment.)

msg3487 | Author: malte | Date: 2014-09-21.13:45:41

Well, it's good to at least see part of what we're expecting to see. :-) The
disappearance of the errors with the 512 KiB reserve is odd indeed, though.
Where can I find the logs on maia?
Some considerations:
1. At first glance, the code looks good.
2. Apart from the failed assertions, another symptom of the same issue is when
the code emits a warning that it cannot determine the peak memory limit. We have
ignored this in the past, but it's really just a slightly different failure mode
for the same issue (running out of memory when trying to read the status file,
but failing more gracefully). In the past, I think the reports always showed the
"memory" value, while now they only seem to show them for successful tasks. So
we cannot find these "-1" cases. Would it be possible to either add a new
attribute that always shows the memory value or change the existing one to also
include failed tasks? (I think we removed the values for failed cases mainly
because of these -1 values.) In the medium term, I think it would be good to
have several attributes (which don't have to be part of the report by default,
of course). Depending on the situation, useful information could include:
- peak memory for successful tasks, nothing for others
- raw peak memory
- a binary attribute indicating if we ran out of memory (our existing
exitcode-based attributes might already serve this purpose)
[After I wrote this, I like the option of leaving the existing memory attribute
as is and adding a "raw peak memory" attribute that always contains the peak
memory info from the log, including failed runs and cases where peak memory is
reported as -1.]
3. It's possible that we're just luckier this time and our sample size of 50
tasks is too small. But if I recall correctly, we had multiple failures with
Airport before, so this does seem a bit unlikely. One possible explanation is
that starting the task without an external address space limit causes the mmap
area to be allocated further out than it is with the 2 GiB external limit, and
hence we don't hit it any more. This should be quite easy to test with strace
and some manual runs. I'll look into it once I know where to find the experiment.

msg3486 | Author: florian | Date: 2014-09-21.13:11:33

I ran experiments for all combinations and the results were a bit unexpected.
Using a memory reserve <= 8 KiB leads to the expected error in about half the
tasks but there were no assertion errors for larger reserves.
Limit     Reserve   Unexplained errors
2 GiB     8 KiB     24
2 GiB     4 KiB     27
1 GiB     8 KiB     26
1 GiB     4 KiB     29
64 MiB    8 KiB     25
64 MiB    4 KiB     31
256 MiB   8 KiB     31
256 MiB   4 KiB     32
Full report:
http://ai.cs.unibas.ch/_tmp_files/pommeren/issue469-airport-buffered.html
I guess there should at least be errors for the original setting of 2 GiB and
512 KiB, so I think I might have introduced a bug with the parameters. I ran the
experiment with a 3 GiB limit from lab, so only the internal limit should have
mattered. Could you have a look at the diff?
https://bitbucket.org/flogo/downward-issues-issue469/branch/issue469-experimental

msg3478 | Author: jendrik | Date: 2014-09-20.11:22:16

Interesting stuff!
I wanted to work on issue414 over the weekend. Will let you know how it goes.

msg3477 | Author: malte | Date: 2014-09-20.02:43:27

PS: Disabling buffering when reading the status file (with pubsetbuf like here:
http://stackoverflow.com/questions/16605233/how-to-disable-buffering-on-a-stream) eliminated
essentially all the memory use. Strangely, though, when reporting peak memory, the
program now seems to use *less* memory when doing nothing. This seems to be
either a flaw with the methodology or the code triggers the release of some
internal buffer of the library. Whatever happens, it may be a good idea to
disable the buffering here, considering that this code is usually called in
low-memory conditions.
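
Roughly, the pubsetbuf trick from the linked answer looks like this (a sketch
along the lines of get_peak_memory_in_kb, not the exact code; note that the
buffer has to be disabled before the file is opened):

    #include <fstream>
    #include <iostream>
    #include <string>

    int main() {
        std::ifstream procfile;
        // Disable the stream's internal buffer *before* opening the file, so
        // reading /proc/self/status does not allocate a buffer on the heap.
        procfile.rdbuf()->pubsetbuf(nullptr, 0);
        procfile.open("/proc/self/status");
        std::string word;
        while (procfile >> word) {
            if (word == "VmPeak:") {   // peak virtual memory, reported in kB
                procfile >> word;
                std::cout << "Peak memory: " << word << " KB" << std::endl;
                break;
            }
        }
        return 0;
    }
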
> Regarding open/write/close: I haven't made changes to the code yet, as I have
> mostly been looking at the strace output. If you can run additional experiments
> with different memory limits and memory reserves, I can take care of this, though.
I now think that we've seen all we need to see from this output, so I'd rather
not do this and instead suggest ripping the code out again completely.
So my suggestion is to (in the experimental branch):
- remove these output lines again (or not, they don't really harm either)
- change the Makefile to remove the NDEBUG flag in release mode
- add the options for limiting memory and for setting the amount of reserve memory
and then run an experiment on the Airport tasks with blind search. At this point
I don't yet want to remove the buffering because we want to see the effect of
the different memory limits and reserves and might lose this if we remove the
buffering.
Once we have results for this, the next step could be to see what happens if we
remove the buffering. Does this make sense? Can you set up such an experiment?
(Happy to help if desired.)
Based on what we know now, I'd suggest trying the memory limits:
- memory limits: 2 GiB (careful, doesn't fit into an int -- does fit into a size_t),
1 GiB, 256 MiB, 64 MiB
- memory reserves: 2 MiB, 1 MiB, 512 KiB (our current amount), 256 KiB, 128 KiB,
64 KiB, 32 KiB, 16 KiB, 8 KiB, 4 KiB
I don't think we need all 40 combinations (= 2000 tasks), but I guess it doesn't
really hurt to use them either. The experiment should be rather quick in any case.

msg3476 | Author: malte | Date: 2014-09-20.02:20:32

OK, ltrace didn't really help (and dtrace in the other message was a typo). It
does contain much information about allocations and such, but not in the parts
we're interested in. I don't really know why; perhaps much of this stuff is
inlined, perhaps the tracing doesn't catch calls within a given library (which
makes sense from the way I understand it works).
Instead I used a different method for estimating how much memory we need:
writing a simple program that exhausts memory and checking how soon it is
exhausted if we include print_peak_memory() vs. if we don't. The program is
attached and also includes an example on how to set a memory limit from within C++.
To test it in a "realistic" setting, it is meant to replace the planner main
file. (Everything else apart from utilities.h can then be removed from the
Makefile.) I've tried it with two different kinds of allocations, and in both
cases, I came up with roughly 8.5 KiB of additional memory needed when we use
print_peak_memory.
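
The attached planner.cc is the real test program; for readers without the
attachment, a minimal sketch of the measurement idea (limits and chunk sizes
here are made up) could look like this:

    #include <sys/resource.h>
    #include <iostream>
    #include <new>
    #include <vector>

    int main() {
        // Set an address-space limit from within the process (example value).
        rlimit limit;
        limit.rlim_cur = limit.rlim_max = 64UL * 1024 * 1024;  // 64 MiB
        setrlimit(RLIMIT_AS, &limit);

        // Allocate fixed-size chunks until new fails; comparing the count with
        // and without a subsequent call to print_peak_memory() estimates how
        // much extra memory that call needs.
        std::vector<char *> chunks;
        try {
            for (;;)
                chunks.push_back(new char[1024]);  // 1 KiB per chunk
        } catch (const std::bad_alloc &) {
            if (!chunks.empty()) {
                // Free one chunk so the report below has some room to work.
                delete[] chunks.back();
                chunks.pop_back();
            }
            std::cout << "allocated roughly " << chunks.size()
                      << " KiB before running out" << std::endl;
        }
        return 0;
    }
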

msg3475 | Author: malte | Date: 2014-09-20.01:05:04

PS: Instead of hacking the revision cache, I suggest removing "-DNDEBUG" from
the Makefile in the experimental branch. Then we can use release mode directly.
(Running experiments in debug mode is also something that will become easier
once issue414 is merged and lab is adapted to use the new script.)

msg3474 | Author: malte | Date: 2014-09-20.01:03:20

> For the different memory reserves, I guess we need a parameter (or hack it
> into the code before compiling).
I think a parameter in the code is perhaps the easiest way if we only want the
functionality in the experimental branch. See for example how the random seed is
set ("--random-seed" in option_parser.cc).
> For different memory limits, there is no way around a separate experiment for
> each limit. They can be set up by adding the parameter "limits={'search_memory':
> 1024}" to the experiment.
We can support this better once issue414 is merged, which is currently waiting
for portfolio integration. (Jendrik, do you have time for this in the near
future? Otherwise I can also try to take over the issue again.)
In the meantime, for this issue, it's perhaps easiest to simply set a memory
limit from within the program based on an option inside the planner, just like
the option for the memory reserve. See the second-last message on
http://www.linuxquestions.org/questions/programming-9/setrlimit-81206/ for an
example. (We want to limit RLIMIT_AS.)
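
Following that example, a hypothetical sketch of setting both options from
within the planner (helper and option names invented for illustration, values
made up) might look like:

    #include <sys/resource.h>
    #include <cassert>
    #include <cstddef>

    // Hypothetical helpers for the experimental branch: limit the address
    // space and allocate the reserve ("memory padding") from options.
    static char *memory_padding = nullptr;

    void set_memory_limit_in_bytes(size_t bytes) {
        rlimit limit;
        limit.rlim_cur = limit.rlim_max = bytes;
        setrlimit(RLIMIT_AS, &limit);   // limit the virtual address space
    }

    void reserve_memory_padding_in_bytes(size_t bytes) {
        assert(!memory_padding);
        memory_padding = new char[bytes];
    }

    int main() {
        // In the real code these values would come from option parsing,
        // e.g. "--memory-limit" and "--memory-reserve" (invented names).
        set_memory_limit_in_bytes(size_t(2) * 1024 * 1024 * 1024);  // 2 GiB
        reserve_memory_padding_in_bytes(512 * 1024);                // 512 KiB
        // ... run the search ...
        return 0;
    }
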

msg3473 | Author: malte | Date: 2014-09-20.00:55:34

10 * 2 * 50 is quite a bit more than 384, though, while 10 * 2 * 15 would be
less. But I don't mind running all 50 airport tasks if that saves work for us.
Actually, for simplicity, for now I think it's enough to run each airport task
only once. We're no longer stabbing completely in the dark (see below), and to
test some quick hypotheses, I think running each task only once will give us
enough chances to trigger bogus behaviour. But once we have something that seems
to work more reliably, perhaps we might want to do a more thorough test (or
perhaps running all tasks would be that thorough test).
Regarding open/write/close: I haven't made changes to the code yet, as I have
mostly been looking at the strace output. If you can run additional experiments
with different memory limits and memory reserves, I can take care of this, though.
The strace output reveals some interesting things. I ran the strace call from my
previous email with memory limits (ulimit -v) of 64 MiB, 128 MiB, 256 MiB, 512
MiB, 1 GiB and 2 GiB.
Judging from the strace output, memory management on the maia machines seems to
work roughly as follows:
1. "Small" memory requests are handled by the C++ standard library internally.
It parcels the heap, and when it runs out of space, it calls brk() to extend the
heap, usually by 132-136 KiB. (Presumably it's something like: whatever we need
straight away plus 128 KiB for the future, in increments of 4 KiB, which is the
page size.) Hence, when small objects are released, their memory should be
usable directly by other small objects, without the OS being involved in any
way. (Of course, memory fragmentation is an issue here. If we release two 1 KiB
objects, this doesn't give us space for a 2 KiB object, unless they happened to
be adjacent in memory and memory management detects this, which I assume it would.)
2. When brk() fails for some reason, it attempts fallback measures for providing
memory for small objects despite being unable to increase the heap.
Specifically, mmap2() is called with various options to get chunks of 1 MiB or 2
MiB of memory. If these also fail, the out-of-memory handler is called.
3. "Large" memory requests (not sure how large they need to be to count as
large, but from the strace output, 196 KiB seems to be large enough -- also see
below) are satisfied directly by mmap2(). So for large objects, we go directly
to the OS, and when the memory for such an object is released, the memory is
directly returned to the OS with munmap(). For example, for the 1 GiB run,
roughly 48 MiB -- almost 5% of the usable memory -- seem to be needed in one
contiguous chunk at some point. This is likely the main hash table of the global
state registry. With more than 5 million evaluated states at that point, this
seems reasonable.
Here is what strace tells us about the cases when we run the task with a memory
limit of 1 GiB or below (none of these showed the error):
We see many brk() calls, using up the available memory in chunks of roughly
132-136 KiB. These are interspersed with the occasional mmap2()/munmap() for
large objects. At some point brk() cannot provide more memory, and neither can
the mmap2() fallback.
At this point, our out-of-memory handler releases our reserve memory. This is a
large object (512 KiB), so it was allocated with mmap2() rather than brk(). This
means that the memory it frees is released with munmap() and cannot be used
directly by the program; it is returned to the OS. However, this means we are
now again 512 KiB (actually 516 KiB because the mapped memory is a bit larger,
presumably for book-keeping information) below our memory limit, so the next
brk() attempt succeeds as there is available memory again.
One further brk() call providing 132 KiB is enough to gracefully take the
program to termination. In most cases, this brk() call is needed right after the
call to open the status file from which we read the used memory. In other cases,
it's right before this call. In any case, right around this call seems to be
where we need additional memory (presumably for buffering, but perhaps not
exclusively for buffering given that brk() isn't always called from exactly the
same place).
Here is what is different with the 2 GiB limit:
After using around 1.3-1.4 GiB, in other words long before we reach the memory
ulimit, the brk() calls start failing. At this point, memory management starts
using mmap2() with chunks of 1 MiB instead after each failed call to brk().
This continues until we run out of memory and release our reserved memory block.
After this, brk() still fails, and the fallback mmap2() also fails because it
wants a chunk of 1 MiB but we only released roughly half of that. We get the
recursive invocations of the handler, and at some point these lead to a
segmentation fault. (I don't really know why. Perhaps we run out of stack space,
although for this the number of recursive invocations appears low. Perhaps the
kernel detects that the SIGABRTs from the failed assertion trigger the signal
handler recursively and doesn't like this.)
So there are two main problems here:
1. We return the reserve memory directly to the OS rather than keeping it around
for our own use.
2. When we ask the OS for more memory after returning it some, we ask for a
larger chunk than what is available.
Problem #2 could presumably be fixed by reserving 1 MiB or more of memory, but
this does appear somewhat wasteful (it's probably 1000x more than what is
actually needed) and hence I think this would be a pity.
Problem #1 could perhaps be fixed by using a *smaller* reserve memory, less than
the amount where allocations go straight to mmap2(). I think that something like
32 KiB might do the trick. (Some googling shows that 128 KiB might be the magic
number. [I have since confirmed this -- "man 3 malloc", search for
"MMAP_THRESHOLD".])
None of these is a stable solution of course, as the magic constants of 1 MiB,
128 KiB or whatever can change at any time. But if they work, they would be good
enough for me. Alternatively, we could also try to implement our out-of-memory
handlers in a way that they don't need dynamic memory at all. This should be
quite possible, too, by avoiding going through C++ libraries and the iostream
library there. (It's not that we do terribly complicated processing there.)
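
As a sketch of that direction (function name invented, not the actual
utilities.cc code), peak memory can be read from /proc/self/status with plain
open/read/close into a stack buffer, so the handler never touches the heap:

    #include <fcntl.h>
    #include <unistd.h>
    #include <cstring>

    // Read /proc/self/status without dynamic allocation and scan for the
    // "VmPeak:" line by hand. The file is only about 1 KiB, so a fixed
    // stack buffer is enough.
    long get_peak_memory_in_kb_no_alloc() {
        char buffer[4096];
        int fd = open("/proc/self/status", O_RDONLY);
        if (fd < 0)
            return -1;
        ssize_t length = read(fd, buffer, sizeof(buffer) - 1);
        close(fd);
        if (length <= 0)
            return -1;
        buffer[length] = '\0';
        const char *line = std::strstr(buffer, "VmPeak:");
        if (!line)
            return -1;
        long kb = 0;
        for (const char *c = line + std::strlen("VmPeak:"); *c; ++c) {
            if (*c >= '0' && *c <= '9')
                kb = kb * 10 + (*c - '0');
            else if (kb > 0)
                break;   // stop after the number
        }
        return kb;
    }
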
One interesting question is why brk() starts failing in the 2 GiB case. Heap
space must be contiguous, and my guess was that at some point we can't linearly
increase the heap because we hit a part of the address space that is already in
use by mmap.
Going through the strace output for the 2 GiB case, the last successful brk()
call sets the heap barrier to address 0x55536000. Extending it to 0x55557000
fails. Looking at the strace() output again, we see that address 0x55576000 and
later are used for mmap2() calls -- these addresses are correct for *all* memory
limits, not just the 2 GiB limit. This is awfully close to 0x55557000, so indeed
there is a limit there that brk() cannot go beyond. With the small memory
limits, this means that the initial brk() value (which was 0xb0c5000 or less for
all memory limits) and the mmap() area are far enough apart that there is always
enough space to grow the heap. The difference between 0x55576000 and 0xb0c5000
is more than 1 GiB, so heap and mmap area will never collide. However, with our
2 GiB memory limit, they can. (Presumably the kernel's heuristic for placing
these things is good enough for all but the most memory-hungry applications.)
Even if all this is correct, one slightly unsatisfying thing that remains is
that we have to guess how much of a memory reserve we actually need in order to
terminate the program gracefully. To gather more information about this, strace
is not good enough -- we would also need to trace the allocations within our
program that don't hit the OS to see how much memory is actually asked for. I
will see if I can find out some relevant information with dtrace.

msg3472 | Author: florian | Date: 2014-09-19.20:34:53

As far as I know, the easiest way to run a config more than once is to add it
multiple times with different config nicks. Running all airport tasks is also
slightly easier to set up than only 15 tasks, so as long as we are staying under
384 tasks per experiment, we may want to stick with the full domain.
For different memory limits, there is no way around a separate experiment for
each limit. They can be set up by adding the parameter "limits={'search_memory':
1024}" to the experiment.
For the different memory reserves, I guess we need a parameter (or hack it into
the code before compiling).
If I should do some of these changes, could you push your updated code to the
bitbucket repository?

msg3471 | Author: malte | Date: 2014-09-19.20:07:31

I agree running just Airport is enough. The results don't seem to be well
reproducible, though, so ideally I think later exploratory experiments should
perform multiple runs (10?) for each task if this feasible. (If not, perhaps we
can also just live with the randomness, but then I'd rather have a larger task
set to make sure we do catch the cases that only strike somewhat rarely.)
Only blind search and Airport seems fine, and I'd limit it even further to just
those 15 airport tasks for which we've seen problems if we can do the 10
repetitions.
Here are some things that I think might convey some useful information. (This
doesn't mean we have to run all these:)
- Trying different memory limits. We currently use 2 GiB. I'd also be interested
in results for 1 GiB, 256 MiB or 64 MiB.
- Trying different sizes for the memory reserve. I'd be interested what happens
with 1000 KiB, 1 MB, 2 MB, 256 KiB, 128 KiB, 32 KiB, or 8 KiB. (There are
reasons why the small limits might behave differently in unintuitive ways --
will elaborate if we see something interesting.)

msg3470 | Author: malte | Date: 2014-09-19.19:54:51

> Sorry about the wrong functions, I'm not at all familiar with these low-level
> functions and thought fputs would be more low-level than a file stream. However,
> there are outputs for entering the new_handler. If fputs allocates memory,
> shouldn't these immediately lead to an endless loop? Anyway, you should have
> write access to the repository, so you can push any changes there.
No problem! Just thought it would be good to clarify. (C-style file output is
generally buffered, just like C++ output is. I'm not sure if there are any
meaningful differences between C stdio and C++ iostreams apart from the fact
that one interface is procedural and the other is object-oriented.)
> I didn't try to reproduce the effects locally for this issue. Last time I looked
> at this code, I ran experiments to determine the overhead of the python wrapper.
> Back then, I tried local experiments and had no problems locally. The grid
> seemed to behave differently. Maybe this was because of the older
> compiler/library versions? Did the static build make a difference for the memory
> issues?
Memory management is part of the standard library and hence glibc-dependent. (Or
whatever the C++ equivalent is called, but I assume it defers to glibc at the
end.) So it could well be something about different library versions.
The main reason I wanted a static compile is because I wanted to work on my
machine. Only statically linked binaries from my machine work on maia. (Apart
from this point, static vs. dynamic linking may make a difference in so far as
statically linked executables will use libraries from the compiling machine,
while dynamically linked ones will use libraries from the executing machine.
Also, static linking will prevent using tools like ltrace, if we want to try
that out later.)

msg3469 | Author: florian | Date: 2014-09-19.19:48:17

About the grid experiment: from the report it looks like running one domain,
e.g. airport, should be enough. This would make the experiment much faster.

msg3468 | Author: malte | Date: 2014-09-19.19:47:31

Thanks for all the data, Florian! I had a somewhat closer look. I have been able
to reproduce this on maia, outside of lab, on the ase01 machine (not recommended
to use it directly unless we know it's not being used at the moment), as follows:
ulimit -v 2097152
/infai/pommeren/projects/downward/issues/issue469/experiments/issue469/data/issue469-blind/code-3372a2d5249d/search/downward-release
--search 'astar(blind())' <
/infai/pommeren/lab/preprocessed-tasks/3372a2d5249d-3372a2d5249d/airport/p18-airport3-p6.pddl/output
This is an Airport task, the first failing one in Florian's report.
I verified that the machine was otherwise not busy, so it cannot be an issue
with the Linux OOM killer or something similar. (I guess this would give very
different output anyway, just thought it might be good to mention this.)
However, "reproduce" is perhaps saying too much. Repeating the above command 10
times, the outcome was:
- 3x failed assertion
- 4x warning that peak memory couldn't be determined
- 3x success (including correct peak memory output)
It would be nice to reproduce it on a local machine; maybe this is possible with
a bit more persistence. I had no luck reproducing it with a lower memory limit.
I then tried tracing the program with strace. The first thing this showed is
that we call "times" all the time, making unfiltered stracing impossible. I
think it would be useful to change this behaviour. Is this perhaps the search
timeout? If yes, it might be a good idea to change the code to only look at the
time if a finite timeout has been set. I then tried a filtered strace:
strace -etrace=\!times
/infai/pommeren/projects/downward/issues/issue469/experiments/issue469/data/issue469-blind/code-3372a2d5249d/search/downward-release
--search 'astar(blind())' <
/infai/pommeren/lab/preprocessed-tasks/3372a2d5249d-3372a2d5249d/airport/p18-airport3-p6.pddl/output.
(To be continued -- waiting for some more data from a second strace run and
don't want to lose the partially typed message in the meantime if something bad
happens. :-))

msg3467 | Author: florian | Date: 2014-09-19.19:45:39

Sorry about the wrong functions, I'm not at all familiar with these low-level
functions and thought fputs would be more low-level than a file stream. However,
there are outputs for entering the new_handler. If fputs allocates memory,
shouldn't these immediately lead to an endless loop? Anyway, you should have
write access to the repository, so you can push any changes there.
I didn't try to reproduce the effects locally for this issue. Last time I looked
at this code, I ran experiments to determine the overhead of the python wrapper.
Back then, I tried local experiments and had no problems locally. The grid
seemed to behave differently. Maybe this was because of the older
compiler/library versions? Did the static build make a difference for the memory
issues?

msg3466 | Author: malte | Date: 2014-09-19.18:19:44

Can't reproduce this locally. I guess that's the same for you?
Experimenting with this is made a bit more difficult for me because the debug build
is not static. We already have an option to control whether the release build is
static -- maybe we can have a similar option for the debug build in the future,
too. For now, I've just removed "-DNDEBUG" from the release build locally so
that I can test this with a release build.

msg3465 | Author: malte | Date: 2014-09-19.18:04:48

Thanks! Had a brief look at the code. When I wrote this:
> maybe we can avoid such issues by writing out debugging information with
> low-level Unix commands (open/write/close) directly to files rather than to
> stdio/stderr and making such we're not using heap memory there.
I really meant it. fopen/fclose can interact with program-internal memory
management in the same way as the iostream library. I'll try with
open/write/close instead and see if it makes a difference. (Perhaps not, but we
should rule out our debugging statements as sources of error.)

msg3464 | Author: florian | Date: 2014-09-19.18:02:15

The debug code is in the branch issue469-experimental which I planned not to
push to the main repository.
https://bitbucket.org/flogo/downward-issues-issue469/branch/issue469-experimental
To run the experiment in debug mode, you need to switch the binary in the
revision cache before building the experiment (make debug && mv downward-debug
downward-release). Unfortunately, this is the easiest option for debug
experiments at the moment.
You should also be able to access my log files on maia at
/infai/pommeren/projects/downward/issues/issue469/experiments/issue469/data/issue469-blind
I uploaded the report that contains run_dirs for all unexplained errors to
http://ai.cs.unibas.ch/_tmp_files/pommeren/issue469-blind.html#unexplained-errors

msg3462 | Author: malte | Date: 2014-09-19.17:55:06

> I think the stream constructor is the more likely problem, but it seems strange
> that it should run out of memory.
Indeed. Even if it were fully buffered by the ifstream, /proc/self/status should
only be roughly 1K in size.
> This loop is repeated a few times (5 in the case I looked at) and then stops in
> the memory handler for no apparent reason. Maybe the process is killed by the
> system?
This appears a bit odd, too.
> It looks like the condition would always evaluate to false, but in the second
> nested call, it evaluates to true.
Maybe the confusion is about how static variables of functions work? Their
initialization (in this case to 0) is only performed *the first time* the
function is called. After that, they retain their value from function
application to function application. They are essentially global variables with
function scope and delayed initialization (only initialized when their function
is first called).
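
A tiny standalone illustration of this behaviour (hypothetical code, unrelated
to utilities.cc):

    #include <iostream>

    void handler_demo() {
        // Initialized only on the first call; the value then persists.
        static int handler_in_progress = 0;
        if (handler_in_progress) {
            std::cout << "re-entered" << std::endl;
            return;
        }
        handler_in_progress = 1;
        std::cout << "first entry" << std::endl;
    }

    int main() {
        handler_demo();  // prints "first entry"
        handler_demo();  // prints "re-entered"
        return 0;
    }
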
Can you give me access to your test code for this?

msg3460 | Author: florian | Date: 2014-09-19.17:46:12

I peppered utilities.cc with debug output and ran an experiment.
Steps 1) through 7) are called as intended; the second failed call to new comes
from one of the following two lines. The debug output above the first line was
printed, but instead of the debug output below the second line, the output for
entering the new_handler function was printed.
ifstream procfile("/proc/self/status");
string word;
I think the stream constructor is the more likely problem, but it seems strange
that it should run out of memory.
After that, the output shows we are running into a loop:
- The new_handler aborts because of the failed assertion.
- The signal_handler catches the signal and tries to output peak memory.
- get_peak_memory_in_kb tries to allocate memory and control ends up in the
  new_handler again.
This loop is repeated a few times (5 in the case I looked at) and then stops in
the memory handler for no apparent reason. Maybe the process is killed by the
system?
Another thing I found strange in the code was in the signal_handler. It seems to
work, but I do not understand why:
    static volatile sig_atomic_t handler_in_progress = 0;
    if (handler_in_progress) {
        raise(signal_number);
    }
    handler_in_progress = 1;
It looks like the condition would always evaluate to false, but in the second
nested call, it evaluates to true.

msg3446 | Author: malte | Date: 2014-09-18.18:35:25

One more thing: we cannot really trust that output on stdout/stderr corresponds
to the intended output here. Things can get lost, duplicated, or appear out of
order in interrupted code like this due to interactions with buffering. (stderr
is probably safer than stdout here since it is usually not buffered, but I don't
know how redirecting I/O to a file, as we do, affects this.)

msg3445 | Author: malte | Date: 2014-09-18.18:33:05

Signal 6 is SIGABRT, which is (among other things) raised by failed assertions.
It is also raised by our ABORT macro, but these cases should be quite obvious.
512 KB of memory is a lot for doing a tiny bit of I/O, so I think the first
thing we should do is find out where this goes wrong. The tricky bit is that
adding debug output might itself trigger an out of memory condition... maybe we
can avoid such issues by writing out debugging information with low-level Unix
commands (open/write/close) directly to files rather than to stdio/stderr and
making sure we're not using heap memory there.
I think it might be a good idea to add tracing lines, starting from right before
"handler_in_progress = 1" in signal_handler and after each following line in
this function and the functions it calls (directly or indirectly), and also
before/after every line in out_of_memory_handler and exit_with. This is a bit
awkward, but probably still one of the simplest ways to find out what is going
on here.
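
For the heap-free tracing itself, something along these lines would do (a
hypothetical macro, not existing code; it only works with string literals):

    #include <unistd.h>

    // Write a string literal straight to stderr (fd 2) with the raw write()
    // call: no heap allocation, no iostream buffering.
    #define TRACE(msg) ((void)write(STDERR_FILENO, msg "\n", sizeof(msg "\n") - 1))

    void example() {
        TRACE("entering out_of_memory_handler");  // hypothetical trace point
    }
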

msg3442 | Author: florian | Date: 2014-09-18.17:51:08

When running in debug mode on the grid, we got a lot of cases (~2500) where our
out-of-memory-handler is called more than once which violates an assertion. The
following steps are supposed to happen when the planner runs out of memory:
1) 'new' tries to reserve memory and fails.
2) Our new_handler 'out_of_memory_handler()' is called.
3) The memory handler asserts that the variable 'memory_padding' is non-zero.
4) The memory handler frees the memory reserved by 'memory_padding' and sets the
variable to zero.
5) Debug output: "Failed to allocate memory. Released memory buffer."
6) exit_with(EXIT_OUT_OF_MEMORY) outputs: "Memory limit has been reached." and
calls 'exit(EXIT_OUT_OF_MEMORY)'
7) 'exit_handler()' is called which in turn calls 'print_peak_memory()'
From the output it looks like steps 1) to 6) work:
stdout:
=======
Failed to allocate memory. Released memory buffer.
Memory limit has been reached.
Peak memory: -1 KB
caught signal 6 -- exiting
=======
stderr:
=======
downward-release: utilities.cc:91: void out_of_memory_handler(): Assertion
`memory_padding' failed.
downward-release: utilities.cc:91: void out_of_memory_handler(): Assertion
`memory_padding' failed.
Unexpected error.
warning: could not determine peak memory
../../code-a38b046fe1e5/search/downward: line 193: 61001 Aborted
=======
The output "caught signal 6 -- exiting" comes from the 'signal_handler()' which
should not be called. I'm not sure whether this is caused by
'exit(EXIT_OUT_OF_MEMORY)' (EXIT_OUT_OF_MEMORY = 6) or whether this is caused by
the failed assertion. As far as I know, exit code 6 is not the same as signal 6,
so the second option seems more likely to me.
One explanation could be that step 6) runs out of memory and starts step 1)
again. We currently have a memory padding of 512 * 1024 chars.
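
For readers unfamiliar with the scheme, here is a simplified sketch of steps
1) to 7) (not the actual utilities.cc code; error handling and the exit
handler are omitted):

    #include <cassert>
    #include <cstdlib>
    #include <iostream>
    #include <new>

    static char *memory_padding = new char[512 * 1024];  // the 512 KiB reserve

    void out_of_memory_handler() {
        // Steps 3)-4): the reserve must still exist; a second entry would
        // fail this assertion, which is exactly the bug reported above.
        assert(memory_padding);
        delete[] memory_padding;
        memory_padding = nullptr;
        // Steps 5)-6): report and exit; exit() then runs the exit handler,
        // which prints peak memory (step 7).
        std::cout << "Failed to allocate memory. Released memory buffer." << std::endl;
        std::cout << "Memory limit has been reached." << std::endl;
        std::exit(1);  // stands in for exit_with(EXIT_OUT_OF_MEMORY)
    }

    int main() {
        std::set_new_handler(out_of_memory_handler);
        // ... the search eventually exhausts memory; step 1) is a failing
        // call to new, step 2) is the call to this handler ...
        return 0;
    }
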

Date                | User    | Action | Args
2014-09-25 15:38:17 | jendrik | set    | messages: + msg3545
2014-09-24 13:23:18 | malte   | set    | messages: + msg3527
2014-09-24 11:13:03 | jendrik | set    | messages: + msg3523
2014-09-24 00:56:09 | malte   | set    | messages: + msg3519
2014-09-23 22:09:08 | jendrik | set    | messages: + msg3517
2014-09-23 19:34:28 | malte   | set    | messages: + msg3515
2014-09-23 19:12:58 | florian | set    | messages: + msg3513
2014-09-23 18:43:42 | malte   | set    | messages: + msg3512
2014-09-23 18:41:58 | malte   | set    | messages: + msg3511
2014-09-23 18:38:41 | florian | set    | status: chatting -> resolved, messages: + msg3510
2014-09-23 17:06:03 | florian | set    | messages: + msg3508
2014-09-22 22:38:49 | florian | set    | assignedto: florian
2014-09-22 22:38:14 | florian | set    | messages: + msg3501
2014-09-21 18:32:14 | malte   | set    | messages: + msg3493
2014-09-21 17:49:29 | malte   | set    | messages: + msg3492
2014-09-21 15:13:02 | malte   | set    | messages: + msg3489
2014-09-21 14:19:30 | malte   | set    | messages: + msg3488
2014-09-21 13:45:41 | malte   | set    | messages: + msg3487
2014-09-21 13:11:33 | florian | set    | messages: + msg3486
2014-09-20 11:22:17 | jendrik | set    | messages: + msg3478
2014-09-20 02:43:27 | malte   | set    | messages: + msg3477
2014-09-20 02:20:32 | malte   | set    | files: + planner.cc, messages: + msg3476
2014-09-20 01:05:04 | malte   | set    | messages: + msg3475
2014-09-20 01:03:20 | malte   | set    | messages: + msg3474
2014-09-20 00:55:34 | malte   | set    | messages: + msg3473
2014-09-19 20:34:53 | florian | set    | messages: + msg3472
2014-09-19 20:07:31 | malte   | set    | messages: + msg3471
2014-09-19 19:54:51 | malte   | set    | messages: + msg3470
2014-09-19 19:48:17 | florian | set    | messages: + msg3469
2014-09-19 19:47:31 | malte   | set    | messages: + msg3468
2014-09-19 19:45:39 | florian | set    | messages: + msg3467
2014-09-19 18:19:44 | malte   | set    | messages: + msg3466
2014-09-19 18:04:48 | malte   | set    | messages: + msg3465
2014-09-19 18:02:15 | florian | set    | messages: + msg3464
2014-09-19 17:55:06 | malte   | set    | messages: + msg3462
2014-09-19 17:46:12 | florian | set    | messages: + msg3460
2014-09-18 18:45:13 | jendrik | set    | nosy: + jendrik
2014-09-18 18:35:25 | malte   | set    | messages: + msg3446
2014-09-18 18:33:05 | malte   | set    | status: unread -> chatting, messages: + msg3445
2014-09-18 17:51:08 | florian | create |