Issue 781: make stubborn sets computation more efficient

Title	make stubborn sets computation more efficient
Priority	wish	Status	resolved
Superseder		Nosy List	gabi, jendrik, malte, silvan
Assigned To	jendrik	Keywords
Optional summary

Created on 2018-04-24.17:44:25 by jendrik, last changed by jendrik.

Messages
msg9326 (view)	Author: jendrik	Date: 2020-05-12.16:29:55
> Why not make the pages optional in the two relevant formatting functions, so that the existing call works? Done. It could be that the docs will only be built after the other buildbot steps succeed again. (I think the g_log change needs some adaptations for clang.)
msg9325 (view)	Author: malte	Date: 2020-05-12.16:02:22
> I added some placeholder for the missing page numbers. I don't see a change online yet, but something along the lines of "pp. placeholder" in the public docs appears a poor solution to me. Why not make the pages optional in the two relevant formatting functions, so that the existing call works? > I'd be in favor to use lower case for all enum options in the docs to unify > the situation. An imperfect grep with pattern '"[A-Z][A-Z]' and the C locale finds upper-case enum options in: src/search/cegar/additive_cartesian_heuristic.cc src/search/cegar/subtask_generators.cc src/search/lp/lp_solver.cc src/search/merge_and_shrink/label_reduction.cc src/search/merge_and_shrink/merge_tree_factory_linear.cc src/search/merge_and_shrink/shrink_bisimulation.cc src/search/merge_and_shrink/shrink_fh.cc src/search/operator_cost.cc (I did not try searching for lower-case options instead, this cannot be done so easily.)
msg9324 (view)	Author: silvan	Date: 2020-05-12.15:57:43
> I'd be in favor to use lower case for all enum options in the docs to unify the situation. I agree. > I opened issue966 for cleaning up the pruning module. Thanks.
msg9323 (view)	Author: jendrik	Date: 2020-05-12.15:55:00
I opened issue966 for cleaning up the pruning module.
msg9321 (view)	Author: jendrik	Date: 2020-05-12.15:53:06
I added some placeholder for the missing page numbers. I'd be in favor to use lower case for all enum options in the docs to unify the situation.
msg9320 (view)	Author: malte	Date: 2020-05-12.15:36:27
Something about the page numbers and possibly publisher of the new reference is broken on the pruning method doc page. Can you fix? Doesn't require an issue I would say. Comparing doc pages, I also saw that other doc pages use ALL_CAPS for enum items. Can/should we change this here, or is this currently all mixed anyway? Only affects the doc pages, the option parser ignores case for these arguments.
msg9318 (view)	Author: silvan	Date: 2020-05-12.15:27:42
Great, thanks! Do you create a follow-up issue for refactoring module? I really think that the current state it not ideal. (I haven't read all of your discussion on bitbucket, but I already saw some comments of the form "this should be done in the next issue".)
msg9311 (view)	Author: jendrik	Date: 2020-05-12.14:51:53
Perfect! Merged.
msg9309 (view)	Author: malte	Date: 2020-05-12.14:46:06
Ready to merge from my side.
msg9308 (view)	Author: jendrik	Date: 2020-05-12.14:45:47
Done handling them :-)
msg9307 (view)	Author: malte	Date: 2020-05-12.14:10:49
I've made a few more (small) comments.
msg9306 (view)	Author: jendrik	Date: 2020-05-12.13:30:42
Thanks for the comments! And yes, I also got the comments for the individual commits. I updated the patch accordingly. Can I merge the branch?
msg9303 (view)	Author: malte	Date: 2020-05-12.12:52:15
Just a note, some of the comments I made in the last few minutes were on individual commits, not the pull request. Hope you got them anyway.
msg9298 (view)	Author: malte	Date: 2020-05-11.19:32:35
I did a code review. I left quite a few comments, but only because the code is already so nice that it satisfies the criteria for a real code review. ;-) Looks really good!
msg9291 (view)	Author: jendrik	Date: 2020-05-08.16:21:04
I merged in the latest default branch (including the changes from issue962).
msg9270 (view)	Author: jendrik	Date: 2020-05-06.00:15:13
Thanks for your comments on Bitbucket, Silvan! I took care of them. Malte, if you like, you could have a look at the code now: https://bitbucket.org/jendrikseipp/downward/pull-requests/84
msg9266 (view)	Author: silvan	Date: 2020-05-05.18:03:14
Fine with me, but I'd suggest to directly continue working on the interface then, because the current integration of a fixed point computation within another one is really not ideal.
msg9265 (view)	Author: jendrik	Date: 2020-05-04.13:07:54
I agree that the interface of the stubborn sets classes is not ideal. If we start changing it in this issue, however, the code will become difficult to review. I suggest to merge the code for the atom-centric version, and then open a new issue that cleans up the interface.
msg9264 (view)	Author: malte	Date: 2020-05-04.12:30:00
Hi Silvan, all code changes and renames of classes and methods are of course fine. The thing that needs more care is changing the command-line option names of the old pruning methods.
msg9263 (view)	Author: silvan	Date: 2020-05-04.12:25:17
I also think we could keep all features, so I'm with this. Renaming also sounds like a good idea even if it breaks some previous code. More generally, I just left a comment on bitbucket related refactoring the pruning module. There are several TODOs in the interface of the stubborn sets base class, and our new code does not really fit this interface anyway (we perform our fixpoint iteration within one iteration of the fixpoint iteration of the base class and we don't use the data structures of the base class). So I'd suggest to see if we can change the interface of the stubborn sets base class so that all three variants can make use of it. If not, then we might consider having an intermediate layer, called OperatorCentric..., from which the two existing classes inherit. We should also consider if we can make the atom selection strategies available to all different stubborn set algorithms.
msg9262 (view)	Author: jendrik	Date: 2020-05-04.11:52:57
Renaming the plugins and using temporary aliases sounds like a good idea. But it also sounds like a long discussion :-) For now, I just renamed the new plugin to atom_centric_stubborn_sets().
msg9261 (view)	Author: malte	Date: 2020-05-04.11:07:49
We currently have consistent naming conventions merge_... and shrink_... for merge and shrink strategies and ..._constraints for constraints. (Unfortunately not the same one...) We currently also have the consistent stubborn_sets_..., which is not a great convention; prune_... or pruning_... would probably be better and would also allow us to use prune_null(). But then we're talking about renaming again. I guess I don't feel very strongly about this. Perhaps a great renaming would be useful at some point, possibly together with introducing aliases for some time to avoid breaking stuff. If we rename things, I agree a separate issue would be good. That would also make more sense w.r.t. what the change log looks like. (To be clear, I'm not so strongly against renaming that you shouldn't do it if you prefer.)
msg9260 (view)	Author: jendrik	Date: 2020-05-04.11:01:40
I agree that breaking user code is not good. Should the new plugin still be called "atom_centric_stubborn_sets" (instead of "stubborn_sets_atom_centric")? Do we still want to rename the StubbornSetSimple class (not the plugin)?
msg9259 (view)	Author: malte	Date: 2020-05-04.10:53:07
PS: If we change plugin names, please mention this prominently in the change notes.
msg9258 (view)	Author: malte	Date: 2020-05-04.10:52:42
I am not such a fan of changing plugin names because it breaks existing usage (also in our sample experiments etc.), every user has to change their code, and the benefit is limited. I think we have some other poor names too and haven't been too bothered (e.g. hmax vs. add is not consistent and "sum" vs. "add" is confusing).
msg9257 (view)	Author: gabi	Date: 2020-05-04.10:30:11
I am clearly in favor of variant "operator_centric_stubborn_sets".
msg9256 (view)	Author: jendrik	Date: 2020-05-04.10:04:29
I agree that stubborn_sets_simple should become stubborn_sets_operator_centric. What do you think about changing the plugin names (the names used in commandline configs) from "stubborn_sets_operator_centric" etc. to "operator_centric_stubborn_sets"? I'd do the renaming of other stubborn sets variants in a different issue or in the main branch.
msg9255 (view)	Author: gabi	Date: 2020-05-04.09:47:02
I don't think we must integrate all features from the paper into the standard code base but since in this case it doesn't add too much clutter to the code, I would be fine with the variant in Jendrik's pull request. I wonder whether we should rename some of the existing parts (stubborn_sets_simple and stubborn_sets_ec). I think ec can stay as it is but simple should maybe become stubborn_sets_operator_centric (analogous to ..._atom_centric). What do you think?
msg9253 (view)	Author: jendrik	Date: 2020-05-02.14:12:33
I updated the code according to msg9249. Here is the pull request: https://bitbucket.org/jendrikseipp/downward/pull-requests/84 Gabi, what's your opinion on the features? Silvan, could you review the code, please?
msg9250 (view)	Author: malte	Date: 2020-04-23.21:27:26
Thanks for the initiative, Jendrik! I'm always for making the names in the code match the terminology of the papers more closely. The docs for the plugin should cite the SoCS paper using our paper citation feature. I don't necessarily think all variable orderings are needed, but I don't have strong feelings to the opposite either. I guess this is also a question of code size vs. code utility. What do the others think? I can do a quick final code review once the others are happy to see this merged.
msg9249 (view)	Author: jendrik	Date: 2020-04-23.21:21:46
For context: we turned this issue into a SoCS 2020 paper. I think it would be good to integrate the code now that things are still fresh on our minds. Therefore, I updated the pull request by merging from default, stripping the experimental commits and merged the code version from our paper. That version repeated the code for selecting facts once for preconditions and once for the goal facts, so I factored out the common code into a separate method. Before I polish the code further, we should decide whether we want to keep all implemented features. I think it makes sense to keep the code for marking variables (the "siblings" feature) and I would also keep the option for turning this on and off. In my opinion the feature could be "on" by default. I don't think we necessarily need to keep all variable orderings, but they also don't need a huge amount of code, so I'm indifferent about whether to keep them. I would definitely keep "quick skip" and make it the default. Finally, we should think about whether we want to revise any names. I propose the following changes: StubbornSetsQueue -> StubbornSetsAtomCentric (for class, file and plugin names) mark_variables -> use_sibling_shortcut variable_ordering -> atom_selection_strategy minimize_ss -> quick_skip What are your thoughts on the features, defaults and names?
msg7102 (view)	Author: jendrik	Date: 2018-04-30.09:33:53
Regarding the stubborn sets: Yes, I'm also interested in working on this further. Regarding the tracking tools: I use uBlock Origin, Privacy Badger and HTTPS Everywhere and can recommend all of them.
msg7101 (view)	Author: silvan	Date: 2018-04-29.12:42:55
> Does anyone have any recommendations for any of the tools they discuss, or > others? I only use Ad Block Plus and Ghostery. The latter one mainly because you and/or others recommended it. According to your link, "Disconnect" is their preferred alternative. > Anyone interested in that? I'd be interested and would definitely also need to re-read some of the older papers before contributing to any further discussion.
msg7100 (view)	Author: malte	Date: 2018-04-29.12:34:20
Back to the issue: the experimental results for all three changes look good, which is nice and somewhat unexpected. (My preliminary tests for the "opportunistic NES" idea didn't look encouraging.) There is one further change that I would like to try, to do with a more selective definition of interference than what we currently use. But that needs further thought and some study of one of our older papers, and I'm not sure if it can be combined with the use of weak stubborn sets. More generally, I'm not sure how to best proceed. We now have a prototype implementation, but I think this needs some further thought at the conceptual level to come up with something clean and convincing. I think we should set aside half a day or so to re-read the relevant papers, think about what we want to do at the high-level, and then think how to realize the idea at the data structure level. Anyone interested in that?
msg7099 (view)	Author: malte	Date: 2018-04-29.12:23:22
With Ghostery enabled but set to trust the site, I can render the experiment page in ~4 minutes if I do nothing else at the same time, or ~13 minutes if I simultaneously browse in other tabs. Without Ghostery, the page renders immediately (a few seconds max). Ditto with Ghostery enabled, but loading from a file (as Ghostery only affects http and https URLs, not file URLs). Pausing Ghostery doesn't help. It has to be fully disabled in the browser extension menu. I've looked at alternatives to Ghostery here: https://lifehacker.com/the-best-browser-extensions-that-protect-your-privacy-479408034. Does anyone have any recommendations for any of the tools they discuss, or others? (I'm more interested in the privacy/no-tracking features than ad blocking, so the second category on the page.) I guess filing a bug report with Ghostery could also help.
msg7098 (view)	Author: malte	Date: 2018-04-29.11:43:05
Yes, it seems to be related to Ghostery. White-listing ai.cs.unibas.ch seems to help somewhat, but page load times are still not good. Disabling Ghostery completely seems to address the issue, but is not really an option. I will investigate a bit more.
msg7097 (view)	Author: jendrik	Date: 2018-04-29.11:42:54
The comparison report opens without any issues in both Chrome and Firefox for me.
msg7096 (view)	Author: jendrik	Date: 2018-04-29.11:40:46
I don't think anything changed regarding the reports in Downward Lab.
msg7095 (view)	Author: silvan	Date: 2018-04-29.11:31:50
I have the same issue. For me, though, the browser says that some script (Ghostery addon?) may run forever. Loading the reports from a local file works indeed, or alternatively using google chrome.
msg7094 (view)	Author: malte	Date: 2018-04-29.11:29:57
Did anything change with the reports over the last week, e.g. some kind of change in downward-lab? I can no longer open them on Firefox over the net (Ubuntu 16.04), they load forever and crank the CPU usage up to 100% for each tab. Downloading them with wget and opening them from the file still works, so it looks like it is some interaction between the partial loading of the page off the web and the rendering.
msg7092 (view)	Author: jendrik	Date: 2018-04-29.09:48:15
Here are the experimental results for the different extensions: http://ai.cs.unibas.ch/_tmp_files/seipp/issue781-v4-extensions-comparison.html http://ai.cs.unibas.ch/_tmp_files/seipp/issue781-v4-extensions.html I fixed measuring the pruning time. The bug only affected configurations using min_required_pruning_ratio > 0. In all previous experiments, we should therefore only take pruning times of configurations with min_required_pruning_ratio=0 into account.
msg7091 (view)	Author: malte	Date: 2018-04-28.21:00:51
Hi Jendik, I've tried out three small ideas (pushed to your repository; I hope you don't mind -- feel free to strip later). The three ideas all branch off from issue781-v3, so they create a number of new heads. I've used one tag for each, but perhaps I should have used branches instead. The tags are: issue781-v4-wss: use weak stubborn sets instead of strong stubborn sets issue781-v4-varmark: add variable-based marking to fact-based marking issue781-v4-opportunistic: select NES to opportunistically minimize stubborn set The first of these should be a small performance improvement across the board. The second could also offer a performance improvement, although it can also hurt on occasion, and the effect likely varies more by domain. The third will likely make blind search slower, but might help a bit with more expensive heuristics. Can you set up an experiment for these? I'd like to see the three new tags compared to issue781-v3 using the blind heuristic and LM-Cut, so 8 configurations total. (I would only use the new stubborn set algorithm, with min pruning disabled.) If useful, all of these can be polished further. If the third one is not useful, there may be a way of implementing the same functionality more efficiently.
msg7090 (view)	Author: malte	Date: 2018-04-28.12:48:27
(That's for the LM-Cut heuristic.)
msg7089 (view)	Author: malte	Date: 2018-04-28.12:45:36
Great, thanks! Thanks also for adding the time for pruning, but it looks wrong for the configurations with a min pruning ratio. For example, for the new algorithm, the total time for pruning is 2178.89 seconds without a min pruning ratio, but 11282.52 seconds when using the min pruning ratio, which doesn't make sense. (The second number should be lower.)
msg7088 (view)	Author: jendrik	Date: 2018-04-28.10:40:31
Good catch! I fixed this and reran the two configs. Here is the updated report: http://ai.cs.unibas.ch/_tmp_files/seipp/issue781-v2-v3-combined.html
msg7087 (view)	Author: malte	Date: 2018-04-27.20:47:45
Thanks! Looks like the min pruning code does not work for the new stubborn set method yet, and indeed there is a code comment in the parse code that suggests this. Can we change this and rerun the two affected configurations? Given how generic the min pruning feature is, I think it should really work out of the box for a new pruning method without the derived class having to worry about it (other than adding the options). That it doesn't perhaps means that the division betwen the base class responsibilities and the derived class responsibilities are not designed well enough. It would probably make sense to apply the template method pattern to make sure the base class code we always want is always executed. (I already thought the same thing a few days ago because of the code comment that says that derived code can assert that the current state is not a goal state -- this should really not be the derived classes' job.)
msg7086 (view)	Author: jendrik	Date: 2018-04-27.10:37:51
Here are the results for the experiment you requested: http://ai.cs.unibas.ch/_tmp_files/seipp/issue781-v2.html
msg7083 (view)	Author: malte	Date: 2018-04-26.10:59:46
I had a brief look at the code, and if we were about to merge this I'd have a few comments on minor optimizations etc., but I suggest we wait with that until it's clearer what we want to do. There are still quite a few things we might want to try out.
msg7082 (view)	Author: malte	Date: 2018-04-26.10:54:25
Looks good! I think there is more optimization potential, but before we go down this route, I think it would be useful to add a timer to the code that measures and reports the time spent on stubborn set computation. It would be useful to have this for the other methods, too, at least within the branch. I suppose it could be added to the base class? (Whether we want to merge that is a separate question, but I think it would be useful for this issue.) The new code is supposed to be semantically equivalent to the simple stubborn sets, and it looks like the experiments bear this out in terms of expansions etc., although I didn't check all the numbers. Can you also run experiments with LM-Cut, and perhaps experiments with the min-pruning bound enabled? So 12 configurations in total: three stubborn set methods, two heuristics, with and without the pruning bound. Going forward, I think we don't need all these comparisons all the time (e.g. I think another comparison to the expansion core variant only becomes interesting once we make changes that affect what is pruned rather than how quickly the computation is done), but I think it would be good to have all the data once, including how much time the stubborn set computations take.
msg7081 (view)	Author: jendrik	Date: 2018-04-26.07:24:34
Here are the results: http://ai.cs.unibas.ch/_tmp_files/seipp/issue781-v1-blind.html The new version solves more tasks than both older versions in many domains.
msg7080 (view)	Author: jendrik	Date: 2018-04-25.23:43:55
I implemented the pseudo-code: https://bitbucket.org/jendrikseipp/downward/pull-requests/84 An experiment comparing the three stubborn sets variants (using the blind heuristic) is running.
msg7075 (view)	Author: malte	Date: 2018-04-24.22:34:41
I thought about this a bit and will send you some pseudo-code by email. I suggest to first implement this as a new pruning method rather than modifying the existing code (but perhaps starting by copying some code). If results are good, we can think about whether we want all versions later. But it may take a few iterations to get there.
msg7073 (view)	Author: jendrik	Date: 2018-04-24.17:44:25
Quoting Malte in issue778 (msg7072): [...] The main problem with the disabling/interference data structures is that they can be so large, O(N^2) in the number of operators. Only computing the information on demand can help a bit with that, but only on problems where a large portion of operators never becomes applicable. I think it would be better to look into avoiding the O(N^2) data structure altogether. In other words, I think this is a case where we should be thinking about the algorithm rather than polishing the implementation details. I think that by thinking about this a bit more deeply, we might be able to come up with a better idea at the algorithm level, perhaps ending up with a much more substantial improvement. I feel somewhat similarly about issue773: I'm not sure if that change would have been necessary if we had addressed things more directly and made the stubborn set computations more efficient. If someone is interested in looking into this with me, I suggest we set aside half an hour to look at what is actually computed and whether the same information could not be computed more intelligently. I am 80% confident that we can make the stubborn sets substantially faster and also get rid of the worst-case O(\|Actions\|^2) data structures. [...]

History
Date	User	Action	Args
2020-05-12 16:29:55	jendrik	set	messages: + msg9326
2020-05-12 16:02:22	malte	set	messages: + msg9325
2020-05-12 15:57:43	silvan	set	messages: + msg9324
2020-05-12 15:55:00	jendrik	set	messages: + msg9323
2020-05-12 15:53:06	jendrik	set	messages: + msg9321
2020-05-12 15:36:27	malte	set	messages: + msg9320
2020-05-12 15:27:42	silvan	set	messages: + msg9318
2020-05-12 14:51:53	jendrik	set	status: reviewing -> resolved messages: + msg9311
2020-05-12 14:46:06	malte	set	messages: + msg9309
2020-05-12 14:45:47	jendrik	set	messages: + msg9308
2020-05-12 14:10:49	malte	set	messages: + msg9307
2020-05-12 13:30:42	jendrik	set	messages: + msg9306
2020-05-12 12:52:15	malte	set	messages: + msg9303
2020-05-11 19:32:35	malte	set	messages: + msg9298
2020-05-08 16:21:04	jendrik	set	messages: + msg9291
2020-05-06 00:15:13	jendrik	set	messages: + msg9270
2020-05-05 18:03:14	silvan	set	messages: + msg9266
2020-05-04 13:07:54	jendrik	set	messages: + msg9265
2020-05-04 12:30:00	malte	set	messages: + msg9264
2020-05-04 12:25:17	silvan	set	messages: + msg9263
2020-05-04 11:52:57	jendrik	set	messages: + msg9262
2020-05-04 11:07:50	malte	set	messages: + msg9261
2020-05-04 11:01:40	jendrik	set	messages: + msg9260
2020-05-04 10:53:07	malte	set	messages: + msg9259
2020-05-04 10:52:42	malte	set	messages: + msg9258
2020-05-04 10:30:11	gabi	set	messages: + msg9257
2020-05-04 10:04:29	jendrik	set	messages: + msg9256
2020-05-04 09:47:03	gabi	set	messages: + msg9255
2020-05-02 14:12:33	jendrik	set	status: in-progress -> reviewing messages: + msg9253
2020-04-23 21:27:26	malte	set	messages: + msg9250
2020-04-23 21:21:46	jendrik	set	nosy: + gabi messages: + msg9249
2018-04-30 09:33:53	jendrik	set	messages: + msg7102
2018-04-29 12:42:55	silvan	set	messages: + msg7101
2018-04-29 12:34:20	malte	set	messages: + msg7100
2018-04-29 12:23:22	malte	set	messages: + msg7099
2018-04-29 11:43:05	malte	set	messages: + msg7098
2018-04-29 11:42:54	jendrik	set	messages: + msg7097
2018-04-29 11:40:46	jendrik	set	messages: + msg7096
2018-04-29 11:31:50	silvan	set	messages: + msg7095
2018-04-29 11:29:57	malte	set	messages: + msg7094
2018-04-29 09:48:16	jendrik	set	messages: + msg7092
2018-04-28 21:00:51	malte	set	messages: + msg7091
2018-04-28 12:48:27	malte	set	messages: + msg7090
2018-04-28 12:45:36	malte	set	messages: + msg7089
2018-04-28 10:40:31	jendrik	set	messages: + msg7088
2018-04-27 20:47:45	malte	set	messages: + msg7087
2018-04-27 10:37:51	jendrik	set	messages: + msg7086
2018-04-26 10:59:46	malte	set	messages: + msg7083
2018-04-26 10:54:25	malte	set	messages: + msg7082
2018-04-26 07:24:34	jendrik	set	messages: + msg7081
2018-04-25 23:43:55	jendrik	set	status: chatting -> in-progress assignedto: jendrik messages: + msg7080
2018-04-25 11:00:50	silvan	set	nosy: + silvan
2018-04-24 22:34:41	malte	set	status: unread -> chatting messages: + msg7075
2018-04-24 17:44:25	jendrik	create

Issue781