Issue744

Title: reduce f-value output for tasks with high and diverse action costs
Priority: feature
Status: resolved
Superseder:
Nosy List: jendrik, malte, manuel, silvan
Assigned To: silvan
Keywords:
Optional summary: part of issue746

Created on 2017-11-15.09:06:12 by jendrik, last changed by silvan.

Messages
msg8886 (view) Author: silvan Date: 2019-06-12.17:04:19
Done.
msg8885 (view) Author: malte Date: 2019-06-12.16:07:52
That's a nice result! I expected larger savings in the experiment sizes, but
it is what it is. Happy to consider this one done.
msg8877 (view) Author: silvan Date: 2019-06-12.11:03:38
Results:
https://ai.dmi.unibas.ch/_tmp_files/sieverss/issue744-v1-opt-30min.html
https://ai.dmi.unibas.ch/_tmp_files/sieverss/issue744-v1-sat-30min.html
(ignore the parser errors; I re-parsed)

Size reports:
https://ai.dmi.unibas.ch/_tmp_files/sieverss/issue744-v1-opt-30min-normal.html
https://ai.dmi.unibas.ch/_tmp_files/sieverss/issue744-v1-opt-30min-silent.html
https://ai.dmi.unibas.ch/_tmp_files/sieverss/issue744-v1-sat-30min-normal.html
https://ai.dmi.unibas.ch/_tmp_files/sieverss/issue744-v1-sat-30min-silent.html

In particular in the sat case, the largest log files got reduced in size by
roughly a factor of 10.

Size of directories (only the run dirs):

[sieverss@flsrv40 issue744-v1-opt-30min]$ du -sch runs-*
702M  runs-normal
528M  runs-silent
1.3G  total

[sieverss@flsrv40 issue744-v1-sat-30min]$ du -sch runs-*
712M	runs-normal
561M	runs-silent
1.3G	total
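
As a rough reading of the two listings above, the relative saving can be
computed from the runs-normal and runs-silent sizes. The sketch below is only
illustrative arithmetic; the function name reduction_percent is made up here
and is not part of the planner or the experiment scripts.

```cpp
#include <cassert>

// Relative size reduction in percent when going from `before` to `after`.
// Units cancel out, so the du values (in MiB) can be passed in directly.
double reduction_percent(double before, double after) {
    assert(before > 0.0);  // a zero-sized baseline has no meaningful ratio
    return 100.0 * (before - after) / before;
}
```

With the values above, this gives about 24.8% for the opt experiment (702M to
528M) and about 21.2% for the sat experiment (712M to 561M).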
msg8866 (view) Author: malte Date: 2019-06-07.12:09:41
I would like to see an experiment on this. Experiments are not just there to
verify that what we did works and makes sense; they also quantify the impact.
Can we rerun the original experiment with the new code? I'd also be interested
in the result of something along the lines of

$ du -sch old-experiment-data
$ du -sch new-experiment-data

so that we have an idea how much disk space this change saves us (which is
ultimately the reason why we care about log file size).
msg8862 (view) Author: silvan Date: 2019-06-07.11:32:38
Remove TODOs.
msg8861 (view) Author: silvan Date: 2019-06-07.11:29:31
Merged this one, thanks everyone.

I opened issue921 to discuss how to proceed with reducing the output of overly
verbose heuristics.

I'll send an email to the group to advertise the new verbosity option for all
future experiments on our grid.
msg8858 (view) Author: malte Date: 2019-06-06.21:40:33
Looks good to me. Bitbucket pipelines complain, though, I think because of
uncrustify. (Didn't check in detail.)
msg8855 (view) Author: silvan Date: 2019-06-06.18:37:10
Thanks! I'll wait for Malte to have another look.

We also shouldn't forget to tell everyone running experiments on our cluster
to use verbosity=silent once this is merged, so as to reduce disk load.
msg8853 (view) Author: jendrik Date: 2019-06-06.18:01:46
I looked at the individual commits and left some comments there.
msg8852 (view) Author: silvan Date: 2019-06-06.17:54:45
I added the verbosity flag directly to SearchEngine as suggested. This allowed
me to pass it not only to SearchStatistics but also to SearchProgress, which is
the only place that prints "New best heuristic value..." using the evaluators.
Thus, to control all output that is generated *during* search, one can now use
the verbosity option of search engines. With verbosity=silent, no statistics or
progress information are printed during search, but the usual statistics are
still printed at the end. This addresses both TODOs for future work in the
summary.
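
A minimal sketch of the behaviour described above, assuming a two-level
verbosity enum (SILENT and NORMAL). ToyStatistics, its methods and the output
format are hypothetical stand-ins for illustration; they are not the actual
SearchStatistics or SearchProgress code.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Assumed verbosity levels; the real enum may contain more values.
enum class Verbosity { SILENT, NORMAL };

// Toy stand-in for a statistics object owned by a search engine.
class ToyStatistics {
    Verbosity verbosity;
    int expanded = 0;
    std::ostringstream log;  // collects output instead of writing to stdout
public:
    explicit ToyStatistics(Verbosity v) : verbosity(v) {}

    void inc_expanded() { ++expanded; }

    // Called during search whenever the f-value changes.
    void report_f_value_progress(int f) {
        assert(f >= 0);  // assumes non-negative action costs
        if (verbosity == Verbosity::SILENT)
            return;  // suppress all progress output during search
        log << "f = " << f << " [" << expanded << " expanded]\n";
    }

    // Called once after search; printed regardless of verbosity.
    void print_final_statistics() {
        log << "Expanded " << expanded << " state(s).\n";
    }

    std::string output() const { return log.str(); }
};
```

The point of the sketch is that the check sits in the reporting method, so
callers in the search loop do not need to know the verbosity level.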

What remains to be done (or at least discussed) is adding a verbosity option
for heuristics, too, which would control the amount of output during heuristic
precomputations/computations. For merge-and-shrink, this option already
exists. I think someone suggested also reducing the noise of iPDB, for
example. I added this to the summary.

This is now ready for another round of review.
msg8840 (view) Author: silvan Date: 2019-06-06.13:29:30
I'm done addressing the comments.
msg8839 (view) Author: silvan Date: 2019-06-06.13:29:22
Add to summary:
- open a follow-up issue for reducing search output of lazy search and enforced
hill climbing.
msg8830 (view) Author: malte Date: 2019-06-06.11:04:10
> Even if that means to expose an option to, say, lazy greedy search that simply
> doesn't do anything? :-) I think we should at least document this.

I saw the many identical calls to "add_verbosity_options" and was under the
misapprehension that some of these were for different search algorithms. (And I
remember that we also want to reduce output for lazy search and EHC, but I see
now that the plan is for this to be controlled by the evaluators, not the
search algorithms themselves.)

It looks like all these identical calls were for different eager plug-ins. In
this case, the suggestion is to refactor this common code so that the different
eager plug-ins don't violate DRY so much.
msg8829 (view) Author: silvan Date: 2019-06-06.09:56:14
Even if that means to expose an option to, say, lazy greedy search that simply
doesn't do anything? :-) I think we should at least document this.
msg8827 (view) Author: jendrik Date: 2019-06-06.07:52:34
I think having the option available for all search engines makes sense.
msg8825 (view) Author: silvan Date: 2019-06-05.23:44:26
I wondered the same while implementing, but thought that there wouldn't be much
value in having the option for all search engines as long as we only "support"
(i.e., use) it in eager search. We can still change it, though, if you prefer.
msg8817 (view) Author: malte Date: 2019-06-05.19:27:38
I left a few comments on bitbucket and like the patch in general.

In many places, we have the combination

     SearchEngine::add_options_to_parser(parser);
+    utils::add_verbosity_options_to_parser(parser);

in the diff. I wonder if verbosity should not be a general search engine option,
so that it would only need to be added in SearchEngine::add_options_to_parser.
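
The suggestion can be illustrated with a toy parser in which the shared
function registers verbosity once and every plug-in inherits it. This is only
a sketch: ToyParser and the add_*_options functions are hypothetical and do
not mirror the real option parser interface, and the option names other than
"verbosity" are placeholders.

```cpp
#include <cassert>
#include <set>
#include <string>

// Toy stand-in for an option parser; it only records option names.
struct ToyParser {
    std::set<std::string> options;

    void add_option(const std::string &name) {
        // Registering a name twice would indicate duplicated setup code.
        bool inserted = options.insert(name).second;
        assert(inserted);
        (void)inserted;  // avoid an unused-variable warning with NDEBUG
    }
};

// Options shared by all search engines, including verbosity.
void add_search_engine_options(ToyParser &parser) {
    parser.add_option("cost_type");
    parser.add_option("max_time");
    parser.add_option("verbosity");
}

// Each plug-in calls the shared function and adds only its own extras.
void add_eager_options(ToyParser &parser) {
    add_search_engine_options(parser);
    parser.add_option("reopen_closed");
}

void add_lazy_options(ToyParser &parser) {
    add_search_engine_options(parser);
    parser.add_option("randomize_successors");
}
```

With this layout, a plug-in cannot forget the verbosity option, and the diff
for adding a new shared option touches a single function.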

We discussed offline that we're not necessarily committing to this design in the
long term (as I recall Jendrik favoured explicitly using loggers and using
different names for the verbosity levels; I also think something more
logger-based is the better long-term solution). But for now we want to address
the immediate problem of too much output without having to solve the larger
question of a sane logging strategy.
msg8815 (view) Author: silvan Date: 2019-06-05.18:00:51
I moved the Verbosity enum from M&S to logging and added a verbosity option for
eager search, disabling output of search progress statistics if verbosity=silent.

Pull request: https://bitbucket.org/SilvanS/fd-dev/pull-requests/51/issue744/diff
msg8812 (view) Author: silvan Date: 2019-06-05.15:31:32
We discussed offline to defer addressing too much output of the form "New best
heuristic..." to a future issue. In this issue, we will introduce a configurable
verbosity level to reduce output related to f-value progress.
msg8811 (view) Author: silvan Date: 2019-06-05.15:08:12
Looking at the output of some parcprinter problems, we realized that which line
is output most frequently depends on the heuristic used: with A* and blind, we
get many lines of the form
f = 438047 [1 evaluated, 0 expanded, t=0.00396825s, 25372 KB]

whereas using lmcut, we mostly get lines
New best heuristic value for lmcut: 430047
[g=8000, 4 evaluated, 2 expanded, t=0.00906805s, 25372 KB]

So it is not obvious that limiting the number of printed "f = ..." lines alone
would resolve this problem.
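
To check which kind of line dominates a given log, the three line forms quoted
above can be told apart by their prefixes. The following is a small sketch
under that assumption; LineKind and classify are names invented here and are
not part of the planner or its parsers.

```cpp
#include <cassert>
#include <string>

enum class LineKind { F_VALUE, NEW_BEST_H, G_STATISTICS, OTHER };

// True if `s` begins with `prefix`.
static bool starts_with(const std::string &s, const std::string &prefix) {
    return s.compare(0, prefix.size(), prefix) == 0;
}

// Classify one log line by the prefixes seen in the examples above.
LineKind classify(const std::string &line) {
    if (starts_with(line, "f = "))
        return LineKind::F_VALUE;
    if (starts_with(line, "New best heuristic value for "))
        return LineKind::NEW_BEST_H;
    if (starts_with(line, "[g="))
        return LineKind::G_STATISTICS;
    return LineKind::OTHER;
}
```

Counting the results of classify over all lines of a log would show whether
the f-value lines or the "New best heuristic value" lines account for most of
its size.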

Furthermore, we also started looking into the code and found that the different
output lines stem from different parts of the code. The "f = ..." and "[g=...]"
lines stem directly from eager search, but "New best heuristic value..." is
triggered in the search progress (indirectly, via a method called check_progress
that prints this line as a side effect, but only for heuristics, not all
evaluators). It is not directly clear if passing a parameter for limiting output
to an evaluator would be a clean solution, if we wanted to limit output of the
form "New best heuristic value..." and not only the f-value statistics as
originally assumed.
msg8801 (view) Author: malte Date: 2019-06-04.17:53:12
Sounds good, let's discuss it further offline! It may be useful to include
Silvan in the discussion if he is available because I think he already included
verboseness settings in some parts of the planner.
msg8799 (view) Author: manuel Date: 2019-06-04.17:34:17
I analyzed the log files and identified the parts that dominate their size.

Optimal planning (most significant issue first; size in number of characters
reflects the size of the largest log file where the issue arises):
  - f-values (size: 10'485'863)
  - ipdb improvement steps (size: 2'842'043)
  - mas steps (21 lines per step; size: 660'836)
  - cegar refinement steps (14 lines per step; size: 318'891)
  - divpot sampling steps (3 lines per step; size: 125'821)

Satisficing planning (both issues are equally significant):
  - h-values
  - plan

In my opinion, limiting the output of f-value and h-value statistics is the
right first step. Moreover, I recommend also providing options for reducing
the output of heuristics. I think the next step is to discuss possible changes
offline.

Data:
https://ai.dmi.unibas.ch/_tmp_files/heusner/issue744-base-opt-30min-sorted.html
https://ai.dmi.unibas.ch/_tmp_files/heusner/issue744-base-sat-30min-sorted.html
msg6614 (view) Author: jendrik Date: 2017-11-27.19:21:13
Once the statistics are separated from the actual search code more clearly, we
would like to make statistics configurable. One parameter should allow limiting
the maximum number of f-value outputs.
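
One way such a parameter could work is a small counter that permits a fixed
number of f-value lines and suppresses the rest. This is a hypothetical sketch
of the idea in this message only; the issue was eventually resolved with a
verbosity option instead, and FValueReportLimiter is not an existing class.

```cpp
#include <cassert>

// Hypothetical helper: allows at most `max_reports` f-value lines.
class FValueReportLimiter {
    int max_reports;
    int num_reports = 0;
    int num_suppressed = 0;
public:
    explicit FValueReportLimiter(int max_reports)
        : max_reports(max_reports) {
        assert(max_reports >= 0);
    }

    // Returns true if the caller should print the next f-value line.
    bool should_report() {
        if (num_reports < max_reports) {
            ++num_reports;
            return true;
        }
        ++num_suppressed;
        return false;
    }

    // Number of f-value lines that were not printed.
    int get_num_suppressed() const { return num_suppressed; }
};
```

The suppressed count could be printed once at the end of the search so that
the log still records how many f-value changes occurred.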
msg6600 (view) Author: jendrik Date: 2017-11-15.09:06:12
Some parcprinter tasks produce huge logfiles, sometimes as large as 30 MiB (see 
http://ai.cs.unibas.ch/_tmp_files/seipp/parcprinter-opt11-strips-p10.run.log.xz). 
Almost all of the output is due to f-value statistics. We should probably make 
logging f-values less verbose for these tasks.
History
Date                 User     Action  Args
2019-06-12 17:04:19  silvan   set     messages: + msg8886
2019-06-12 16:07:52  malte    set     messages: + msg8885
2019-06-12 11:03:38  silvan   set     messages: + msg8877
2019-06-07 12:09:41  malte    set     messages: + msg8866
2019-06-07 11:32:38  silvan   set     messages: + msg8862
                                      summary: part of issue746 TODO: - discuss/open an issue for adding support for verbosity levels for all heuristics. - send email to group/students to tell them to use verbosity=silent -> part of issue746
2019-06-07 11:29:31  silvan   set     status: reviewing -> resolved
                                      messages: + msg8861
2019-06-06 21:40:33  malte    set     messages: + msg8858
2019-06-06 18:37:10  silvan   set     messages: + msg8855
                                      summary: part of issue746 TODO: - discuss/open an issue for adding support for verbosity levels for all heuristics. -> part of issue746 TODO: - discuss/open an issue for adding support for verbosity levels for all heuristics. - send email to group/students to tell them to use verbosity=silent
2019-06-06 18:01:46  jendrik  set     messages: + msg8853
2019-06-06 17:54:45  silvan   set     messages: + msg8852
                                      summary: part of issue746 TODO: - open a new issue to deal with too much output of the form "New best heuristics...". One idea is to set the flag "use_for_reporting_minima" of heuristics depending on the chosen verbosity level of the heuristic (to be added option). - open a follow-up issue for reducing search output of lazy search and enforced hill climbing. -> part of issue746 TODO: - discuss/open an issue for adding support for verbosity levels for all heuristics.
2019-06-06 13:29:30  silvan   set     messages: + msg8840
2019-06-06 13:29:22  silvan   set     messages: + msg8839
                                      summary: part of issue746 TODO: - open a new issue to deal with too much output of the form "New best heuristics...". One idea is to set the flag "use_for_reporting_minima" of heuristics depending on the chosen verbosity level of the heuristic (to be added option). -> part of issue746 TODO: - open a new issue to deal with too much output of the form "New best heuristics...". One idea is to set the flag "use_for_reporting_minima" of heuristics depending on the chosen verbosity level of the heuristic (to be added option). - open a follow-up issue for reducing search output of lazy search and enforced hill climbing.
2019-06-06 11:04:10  malte    set     messages: + msg8830
2019-06-06 09:56:14  silvan   set     messages: + msg8829
2019-06-06 07:52:34  jendrik  set     messages: + msg8827
2019-06-05 23:44:26  silvan   set     messages: + msg8825
2019-06-05 19:27:38  malte    set     messages: + msg8817
2019-06-05 18:00:51  silvan   set     status: chatting -> reviewing
                                      assignedto: silvan
                                      messages: + msg8815
2019-06-05 15:31:32  silvan   set     messages: + msg8812
                                      summary: part of issue746 -> part of issue746 TODO: - open a new issue to deal with too much output of the form "New best heuristics...". One idea is to set the flag "use_for_reporting_minima" of heuristics depending on the chosen verbosity level of the heuristic (to be added option).
2019-06-05 15:08:12  silvan   set     nosy: + silvan
                                      messages: + msg8811
2019-06-04 17:53:12  malte    set     messages: + msg8801
2019-06-04 17:34:17  manuel   set     nosy: + manuel
                                      messages: + msg8799
2017-11-27 19:21:13  jendrik  set     status: unread -> chatting
                                      messages: + msg6614
                                      summary: part of issue746
2017-11-15 09:06:12  jendrik  create