Issue 744: reduce f-value output for tasks with high and diverse action costs

Title	reduce f-value output for tasks with high and diverse action costs
Priority	feature	Status	resolved
Superseder		Nosy List	jendrik, malte, manuel, silvan
Assigned To	silvan	Keywords
Optional summary	part of issue746

Created on 2017-11-15.09:06:12 by jendrik, last changed by silvan.

Messages
msg8886 (view)	Author: silvan	Date: 2019-06-12.17:04:19
Done.
msg8885 (view)	Author: malte	Date: 2019-06-12.16:07:52
That's a nice result! I expected larger savings in the experiment sizes, but it is what it is. Happy to consider this one done.
msg8877 (view)	Author: silvan	Date: 2019-06-12.11:03:38
Results:
https://ai.dmi.unibas.ch/_tmp_files/sieverss/issue744-v1-opt-30min.html
https://ai.dmi.unibas.ch/_tmp_files/sieverss/issue744-v1-sat-30min.html
(ignore the parser errors; I re-parsed)

Size reports:
https://ai.dmi.unibas.ch/_tmp_files/sieverss/issue744-v1-opt-30min-normal.html
https://ai.dmi.unibas.ch/_tmp_files/sieverss/issue744-v1-opt-30min-silent.html
https://ai.dmi.unibas.ch/_tmp_files/sieverss/issue744-v1-sat-30min-normal.html
https://ai.dmi.unibas.ch/_tmp_files/sieverss/issue744-v1-sat-30min-silent.html

In particular in the sat case, the largest log files got reduced in size by roughly a factor of 10.

Size of directories (only the run dirs):
[sieverss@flsrv40 issue744-v1-opt-30min]$ du -sch runs-*
702M	runs-normal
528M	runs-silent
1.3G	total
[sieverss@flsrv40 issue744-v1-sat-30min]$ du -sch runs-*
712M	runs-normal
561M	runs-silent
1.3G	total
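To put the directory sizes above into relative terms, here is a small worked calculation. It is only a sketch: the numbers are copied from the du output (which rounds to whole M units), and the helper name relative_saving is made up for illustration.

```python
# Directory sizes as reported by `du -sch` in this message (du's rounded "M" units).
sizes = {
    "opt": {"normal": 702, "silent": 528},
    "sat": {"normal": 712, "silent": 561},
}


def relative_saving(normal, silent):
    """Fraction of disk space saved by switching from normal to silent."""
    return (normal - silent) / normal


for name, s in sizes.items():
    print(f"{name}: {relative_saving(s['normal'], s['silent']):.1%} smaller")
```

Under these rounded figures, the silent run directories are roughly 25% (opt) and 21% (sat) smaller than the normal ones.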
msg8866 (view)	Author: malte	Date: 2019-06-07.12:09:41
I would like to see an experiment on this. Experiments are not just there to verify that what we did works and makes sense; they are also there to quantify the impact. Can we rerun the original experiment with the new code? I'd also be interested in the result of something along the lines of
$ du -sch old-experiment-data
$ du -sch new-experiment-data
so that we have an idea how much disk space this change saves us (which is ultimately the reason why we care about log file size).
msg8862 (view)	Author: silvan	Date: 2019-06-07.11:32:38
Remove TODOs.
msg8861 (view)	Author: silvan	Date: 2019-06-07.11:29:31
Merged this one, thanks everyone. I opened issue921 to discuss how to proceed with reducing the output of overly verbose heuristics. I'll send an email to the group to advertise the new verbosity option for all future experiments on our grid.
msg8858 (view)	Author: malte	Date: 2019-06-06.21:40:33
Looks good to me. Bitbucket pipelines complain, though, I think because of uncrustify. (Didn't check in detail.)
msg8855 (view)	Author: silvan	Date: 2019-06-06.18:37:10
Thanks! I'll wait for Malte to have another look. We also shouldn't forget to tell everyone running experiments on our cluster to use verbosity=silent once this is merged, so as to reduce disk load.
msg8853 (view)	Author: jendrik	Date: 2019-06-06.18:01:46
I looked at the individual commits and left some comments there.
msg8852 (view)	Author: silvan	Date: 2019-06-06.17:54:45
I added the verbosity flag directly to SearchEngine as suggested. This allowed me to add it not only to SearchStatistics but also to SearchProgress, which is the only place that prints "New best heuristic value..." using the evaluators. Thus, to control all output that is generated during search, one can now use the verbosity option of search engines. Using verbosity=silent will not print any statistics or progress information during search, but will still print the usual statistics at the end. This addresses both TODOs for future work in the summary.

What still remains to be done (or at least discussed) is to add a verbosity option for heuristics, too, which would control the amount of output during heuristic precomputations/computations. For merge-and-shrink, this option already exists. I think that someone suggested also reducing the noise of iPDB, for example. I added this to the summary.

This is now ready for another round of review.
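The behaviour described here (no progress output during search with verbosity=silent, but the usual statistics at the end) could be sketched as follows. This is an illustrative Python model, not the planner's C++ code: the class SearchProgressReporter, its methods, and the simplified output lines are assumptions made up for this example.

```python
import io


class SearchProgressReporter:
    """Hypothetical stand-in for the progress/statistics output of a search engine."""

    def __init__(self, verbosity="normal", out=None):
        self.verbosity = verbosity
        self.out = out if out is not None else io.StringIO()
        self.evaluated = 0

    def report_f_value(self, f):
        self.evaluated += 1
        # Progress output during search is skipped entirely when silent.
        if self.verbosity != "silent":
            print(f"f = {f} [{self.evaluated} evaluated]", file=self.out)

    def print_final_statistics(self):
        # Final statistics are printed regardless of the verbosity level.
        print(f"Evaluated {self.evaluated} state(s).", file=self.out)


def run(verbosity):
    """Simulate three f-value updates followed by the final statistics."""
    reporter = SearchProgressReporter(verbosity)
    for f in (438047, 438048, 438049):
        reporter.report_f_value(f)
    reporter.print_final_statistics()
    return reporter.out.getvalue().splitlines()
```

In this model, run("normal") yields three progress lines plus the final line, while run("silent") yields only the final line.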
msg8840 (view)	Author: silvan	Date: 2019-06-06.13:29:30
I'm done addressing the comments.
msg8839 (view)	Author: silvan	Date: 2019-06-06.13:29:22
Add to summary: - open a follow-up issue for reducing search output of lazy search and enforced hill climbing.
msg8830 (view)	Author: malte	Date: 2019-06-06.11:04:10
> Even if that means to expose an option to, say, lazy greedy search that simply
> doesn't do anything? :-) I think we should at least document this.

I saw the many identical calls to "add_verbosity_options" and was under the misapprehension that some of these were for different search algorithms. (And I remember that we also want to reduce output for lazy search and EHC, but I see now that the plan is that this will be controlled by the evaluators, not the search algorithms themselves.)

It looks like all these identical calls were for different eager plug-ins. In this case, the suggestion is to refactor this common code so that the different eager plug-ins don't violate DRY so much.
msg8829 (view)	Author: silvan	Date: 2019-06-06.09:56:14
Even if that means to expose an option to, say, lazy greedy search that simply doesn't do anything? :-) I think we should at least document this.
msg8827 (view)	Author: jendrik	Date: 2019-06-06.07:52:34
I think having the option available for all search engines makes sense.
msg8825 (view)	Author: silvan	Date: 2019-06-05.23:44:26
I wondered the same while implementing, but thought that there wouldn't be much value in having the option for all search engines as long as we only "support" (aka. use) it in eager search. We can still change it, though, if you prefer.
msg8817 (view)	Author: malte	Date: 2019-06-05.19:27:38
I left a few comments on bitbucket and like the patch in general.

In many places, we have the combination
SearchEngine::add_options_to_parser(parser);
+ utils::add_verbosity_options_to_parser(parser);
in the diff. I wonder if verbosity should not be a general search engine option, so that it would only need to be added in SearchEngine::add_options_to_parser.

We discussed offline that we're not necessarily committing to this design in the long term (as I recall Jendrik favoured explicitly using loggers and using different names for the verbosity levels; I also think something more logger-based is the better long-term solution). But for now we want to address the immediate problem of too much output without having to solve the larger question of a sane logging strategy.
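The suggestion to make verbosity a general search engine option can be illustrated with a small Python model (the planner itself is C++). Everything here is a hypothetical stand-in: the dictionary used as a parser, the option names other than verbosity, and the default value "normal" are assumptions for illustration only. The point is that registering the option once in the base class makes it available to every engine without repeating the call in each plug-in.

```python
class SearchEngine:
    """Stand-in for the common base class of all search engines."""

    @staticmethod
    def add_options_to_parser(parser):
        # Options shared by every search engine are registered once here.
        # Assumed default "normal"; "silent" is the other level used in this issue.
        parser["verbosity"] = "normal"


class EagerSearch(SearchEngine):
    @staticmethod
    def add_options_to_parser(parser):
        SearchEngine.add_options_to_parser(parser)
        parser["eager_only_option"] = False  # placeholder engine-specific option


class LazySearch(SearchEngine):
    @staticmethod
    def add_options_to_parser(parser):
        SearchEngine.add_options_to_parser(parser)
        parser["lazy_only_option"] = False  # placeholder engine-specific option


def options_of(engine_class):
    """Collect the options an engine registers into a fresh parser dict."""
    parser = {}
    engine_class.add_options_to_parser(parser)
    return parser
```

Both options_of(EagerSearch) and options_of(LazySearch) then contain a verbosity entry, even though neither subclass mentions it.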
msg8815 (view)	Author: silvan	Date: 2019-06-05.18:00:51
I moved the Verbosity enum from M&S to logging and added a verbosity option for eager search, disabling output of search progress statistics if verbosity=silent. Pull request: https://bitbucket.org/SilvanS/fd-dev/pull-requests/51/issue744/diff
msg8812 (view)	Author: silvan	Date: 2019-06-05.15:31:32
We discussed offline and agreed to defer addressing the excessive output of the form "New best heuristic..." to a future issue. In this issue, we will introduce a configurable verbosity level to reduce output related to f-value progress.
msg8811 (view)	Author: silvan	Date: 2019-06-05.15:08:12
Looking at the output of some parcprinter problems, we realized that it depends on the used heuristic which line is output most frequently: using A* with blind, we get many lines of the form
f = 438047 [1 evaluated, 0 expanded, t=0.00396825s, 25372 KB]
whereas using lmcut, we mostly get lines
New best heuristic value for lmcut: 430047 [g=8000, 4 evaluated, 2 expanded, t=0.00906805s, 25372 KB]
So it is not directly apparent that limiting the amount of printed lines "f = ..." directly resolves this problem.

Furthermore, we also started looking into the code and found that the different output lines stem from different parts of the code. The "f = ..." and "[g=...]" lines stem directly from eager search, but "New best heuristic value..." is triggered in the search progress (indirectly, via a method called check_progress that prints this line as a side effect, but only for heuristics, not all evaluators). It is not directly clear if passing a parameter for limiting output to an evaluator would be a clean solution, if we wanted to limit output of the form "New best heuristic value..." and not only the f-value statistics as originally assumed.
msg8801 (view)	Author: malte	Date: 2019-06-04.17:53:12
Sounds good, let's discuss it further offline! It may be useful to include Silvan in the discussion if he is available because I think he already included verboseness settings in some parts of the planner.
msg8799 (view)	Author: manuel	Date: 2019-06-04.17:34:17
I analyzed the log files and discovered some space-dominating parts of the log files.

Optimal planning (most significant issue first; size in number of characters reflects the size of the largest log file where the issue arises):
- f-values (size: 10'485'863)
- ipdb improvement steps (size: 2'842'043)
- mas steps (21 lines per step; size: 660'836)
- cegar refinement steps (14 lines per step; size: 318'891)
- divpot sampling steps (3 lines per step; size: 125'821)

Satisficing planning (both issues are equally significant):
- h-values
- plan

In my opinion, the right first step is to limit the output of f-value and h-value statistics. Moreover, I recommend also providing options for reducing the output of heuristics. I think the next step is to discuss possible changes offline.

Data:
https://ai.dmi.unibas.ch/_tmp_files/heusner/issue744-base-opt-30min-sorted.html
https://ai.dmi.unibas.ch/_tmp_files/heusner/issue744-base-sat-30min-sorted.html
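A per-category size breakdown of this kind could be computed along the following lines. This is a minimal sketch, not the script used for the analysis (which is not shown in this issue): the function size_by_category and the category table are hypothetical, and only the two line prefixes "f = " and "New best heuristic value" are taken from log lines quoted in this issue.

```python
from collections import Counter

# Hypothetical category table; only these two prefixes appear in this issue.
CATEGORIES = [
    ("f-values", "f = "),
    ("h-values", "New best heuristic value"),
]


def size_by_category(lines):
    """Return (category, number of characters) pairs, largest category first."""
    sizes = Counter()
    for line in lines:
        for name, prefix in CATEGORIES:
            if line.startswith(prefix):
                sizes[name] += len(line)
                break
        else:
            # Lines matching no known prefix are lumped together.
            sizes["other"] += len(line)
    return sizes.most_common()


# Example input: two lines quoted in this issue plus one unrelated line.
log = [
    "f = 438047 [1 evaluated, 0 expanded, t=0.00396825s, 25372 KB]",
    "New best heuristic value for lmcut: 430047 [g=8000, 4 evaluated, 2 expanded, t=0.00906805s, 25372 KB]",
    "Solution found!",
]
for name, size in size_by_category(log):
    print(name, size)
```

Further categories (for example the ipdb, mas, cegar or divpot steps) would need their own prefixes, which are not given in this issue.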
msg6614 (view)	Author: jendrik	Date: 2017-11-27.19:21:13
Once the statistics are separated from the actual search code more clearly, we would like to make statistics configurable. One parameter should allow limiting the maximum amount of f-value outputs.
msg6600 (view)	Author: jendrik	Date: 2017-11-15.09:06:12
Some parcprinter tasks produce huge logfiles, sometimes as large as 30 MiB (see http://ai.cs.unibas.ch/_tmp_files/seipp/parcprinter-opt11-strips-p10.run.log.xz). Almost all of the output is due to f-value statistics. We should probably make logging f-values less verbose for these tasks.

History
Date	User	Action	Args
2019-06-12 17:04:19	silvan	set	messages: + msg8886
2019-06-12 16:07:52	malte	set	messages: + msg8885
2019-06-12 11:03:38	silvan	set	messages: + msg8877
2019-06-07 12:09:41	malte	set	messages: + msg8866
2019-06-07 11:32:38	silvan	set	messages: + msg8862 summary: part of issue746 TODO: - discuss/open an issue for adding support for verbosity levels for all heuristics. - send email to group/students to tell them to use verbosity=silent -> part of issue746
2019-06-07 11:29:31	silvan	set	status: reviewing -> resolved messages: + msg8861
2019-06-06 21:40:33	malte	set	messages: + msg8858
2019-06-06 18:37:10	silvan	set	messages: + msg8855 summary: part of issue746 TODO: - discuss/open an issue for adding support for verbosity levels for all heuristics. -> part of issue746 TODO: - discuss/open an issue for adding support for verbosity levels for all heuristics. - send email to group/students to tell them to use verbosity=silent
2019-06-06 18:01:46	jendrik	set	messages: + msg8853
2019-06-06 17:54:45	silvan	set	messages: + msg8852 summary: part of issue746 TODO: - open a new issue to deal with too much output of the form "New best heuristics...". One idea is to set the flag "use_for_reporting_minima" of heuristics depending on the chosen verbosity level of the heuristic (to be added option). - open a follow-up issue for reducing search output of lazy search and enforced hill climbing. -> part of issue746 TODO: - discuss/open an issue for adding support for verbosity levels for all heuristics.
2019-06-06 13:29:30	silvan	set	messages: + msg8840
2019-06-06 13:29:22	silvan	set	messages: + msg8839 summary: part of issue746 TODO: - open a new issue to deal with too much output of the form "New best heuristics...". One idea is to set the flag "use_for_reporting_minima" of heuristics depending on the chosen verbosity level of the heuristic (to be added option). -> part of issue746 TODO: - open a new issue to deal with too much output of the form "New best heuristics...". One idea is to set the flag "use_for_reporting_minima" of heuristics depending on the chosen verbosity level of the heuristic (to be added option). - open a follow-up issue for reducing search output of lazy search and enforced hill climbing.
2019-06-06 11:04:10	malte	set	messages: + msg8830
2019-06-06 09:56:14	silvan	set	messages: + msg8829
2019-06-06 07:52:34	jendrik	set	messages: + msg8827
2019-06-05 23:44:26	silvan	set	messages: + msg8825
2019-06-05 19:27:38	malte	set	messages: + msg8817
2019-06-05 18:00:51	silvan	set	status: chatting -> reviewing assignedto: silvan messages: + msg8815
2019-06-05 15:31:32	silvan	set	messages: + msg8812 summary: part of issue746 -> part of issue746 TODO: - open a new issue to deal with too much output of the form "New best heuristics...". One idea is to set the flag "use_for_reporting_minima" of heuristics depending on the chosen verbosity level of the heuristic (to be added option).
2019-06-05 15:08:12	silvan	set	nosy: + silvan messages: + msg8811
2019-06-04 17:53:12	malte	set	messages: + msg8801
2019-06-04 17:34:17	manuel	set	nosy: + manuel messages: + msg8799
2017-11-27 19:21:13	jendrik	set	status: unread -> chatting messages: + msg6614 summary: part of issue746
2017-11-15 09:06:12	jendrik	create
