Issue781

Title make stubborn sets computation more efficient
Priority wish Status resolved
Superseder Nosy List gabi, jendrik, malte, silvan
Assigned To jendrik Keywords
Optional summary

Created on 2018-04-24.17:44:25 by jendrik, last changed by jendrik.

Messages
msg9326 (view) Author: jendrik Date: 2020-05-12.16:29:55
> Why not make the pages optional in the two relevant formatting functions, so that the existing call works?

Done. It could be that the docs will only be built after the other buildbot steps succeed again. (I think the g_log change needs some adaptations for clang.)
msg9325 (view) Author: malte Date: 2020-05-12.16:02:22
> I added some placeholder for the missing page numbers.

I don't see a change online yet, but something along the lines of "pp. placeholder" in the public docs appears a poor solution to me. Why not make the pages optional in the two relevant formatting functions, so that the existing call works?

> I'd be in favor to use lower case for all enum options in the docs to unify
> the situation.

An imperfect grep with pattern '"[A-Z][A-Z]' and the C locale finds upper-case enum options in:

src/search/cegar/additive_cartesian_heuristic.cc
src/search/cegar/subtask_generators.cc
src/search/lp/lp_solver.cc
src/search/merge_and_shrink/label_reduction.cc
src/search/merge_and_shrink/merge_tree_factory_linear.cc
src/search/merge_and_shrink/shrink_bisimulation.cc
src/search/merge_and_shrink/shrink_fh.cc
src/search/operator_cost.cc

(I did not try searching for lower-case options instead, this cannot be done so easily.)
msg9324 (view) Author: silvan Date: 2020-05-12.15:57:43
> I'd be in favor to use lower case for all enum options in the docs to unify the situation.
I agree.

> I opened issue966 for cleaning up the pruning module.
Thanks.
msg9323 (view) Author: jendrik Date: 2020-05-12.15:55:00
I opened issue966 for cleaning up the pruning module.
msg9321 (view) Author: jendrik Date: 2020-05-12.15:53:06
I added some placeholder for the missing page numbers.

I'd be in favor to use lower case for all enum options in the docs to unify the situation.
msg9320 (view) Author: malte Date: 2020-05-12.15:36:27
Something about the page numbers and possibly publisher of the new reference is broken on the pruning method doc page. Can you fix? Doesn't require an issue I would say.

Comparing doc pages, I also saw that other doc pages use ALL_CAPS for enum items. Can/should we change this here, or is this currently all mixed anyway? Only affects the doc pages, the option parser ignores case for these arguments.
msg9318 (view) Author: silvan Date: 2020-05-12.15:27:42
Great, thanks! 

Do you create a follow-up issue for refactoring module? I really think that the current state it not ideal. (I haven't read all of your discussion on bitbucket, but I already saw some comments of the form "this should be done in the next issue".)
msg9311 (view) Author: jendrik Date: 2020-05-12.14:51:53
Perfect! Merged.
msg9309 (view) Author: malte Date: 2020-05-12.14:46:06
Ready to merge from my side.
msg9308 (view) Author: jendrik Date: 2020-05-12.14:45:47
Done handling them :-)
msg9307 (view) Author: malte Date: 2020-05-12.14:10:49
I've made a few more (small) comments.
msg9306 (view) Author: jendrik Date: 2020-05-12.13:30:42
Thanks for the comments! And yes, I also got the comments for the individual commits. I updated the patch accordingly. Can I merge the branch?
msg9303 (view) Author: malte Date: 2020-05-12.12:52:15
Just a note, some of the comments I made in the last few minutes were on individual commits, not the pull request. Hope you got them anyway.
msg9298 (view) Author: malte Date: 2020-05-11.19:32:35
I did a code review. I left quite a few comments, but only because the code is already so nice that it satisfies the criteria for a real code review. ;-) Looks really good!
msg9291 (view) Author: jendrik Date: 2020-05-08.16:21:04
I merged in the latest default branch (including the changes from issue962).
msg9270 (view) Author: jendrik Date: 2020-05-06.00:15:13
Thanks for your comments on Bitbucket, Silvan! I took care of them. 

Malte, if you like, you could have a look at the code now:
https://bitbucket.org/jendrikseipp/downward/pull-requests/84
msg9266 (view) Author: silvan Date: 2020-05-05.18:03:14
Fine with me, but I'd suggest to directly continue working on the interface then, because the current integration of a fixed point computation within another one is really not ideal.
msg9265 (view) Author: jendrik Date: 2020-05-04.13:07:54
I agree that the interface of the stubborn sets classes is not ideal. If we start changing it in this issue, however, the code will become difficult to review. I suggest to merge the code for the atom-centric version, and then open a new issue that cleans up the interface.
msg9264 (view) Author: malte Date: 2020-05-04.12:30:00
Hi Silvan, all *code* changes and renames of classes and methods are of course fine. The thing that needs more care is changing the command-line option names of the old pruning methods.
msg9263 (view) Author: silvan Date: 2020-05-04.12:25:17
I also think we could keep all features, so I'm with this. Renaming also sounds like a good idea even if it breaks some previous code. More generally, I just left a comment on bitbucket related refactoring the pruning module. There are several TODOs in the interface of the stubborn sets base class, and our new code does not really fit this interface anyway (we perform our fixpoint iteration within one iteration of the fixpoint iteration of the base class and we don't use the data structures of the base class). So I'd suggest to see if we can change the interface of the stubborn sets base class so that all three variants can make use of it. If not, then we might consider having an intermediate layer, called OperatorCentric..., from which the two existing classes inherit. We should also consider if we can make the atom selection strategies available to all different stubborn set algorithms.
msg9262 (view) Author: jendrik Date: 2020-05-04.11:52:57
Renaming the plugins and using temporary aliases sounds like a good idea. But it also sounds like a long discussion :-)

For now, I just renamed the new plugin to atom_centric_stubborn_sets().
msg9261 (view) Author: malte Date: 2020-05-04.11:07:49
We currently have consistent naming conventions merge_... and shrink_... for merge and shrink strategies and ..._constraints for constraints. (Unfortunately not the same one...)

We currently also have the consistent stubborn_sets_..., which is not a great convention; prune_... or pruning_... would probably be better and would also allow us to use prune_null(). But then we're talking about renaming again. I guess I don't feel very strongly about this. Perhaps a great renaming would be useful at some point, possibly together with introducing aliases for some time to avoid breaking stuff.

If we rename things, I agree a separate issue would be good. That would also make more sense w.r.t. what the change log looks like.

(To be clear, I'm not so strongly against renaming that you shouldn't do it if you prefer.)
msg9260 (view) Author: jendrik Date: 2020-05-04.11:01:40
I agree that breaking user code is not good. 

Should the new plugin still be called "atom_centric_stubborn_sets" (instead of "stubborn_sets_atom_centric")? Do we still want to rename the StubbornSetSimple class (not the plugin)?
msg9259 (view) Author: malte Date: 2020-05-04.10:53:07
PS: If we change plugin names, please mention this prominently in the change notes.
msg9258 (view) Author: malte Date: 2020-05-04.10:52:42
I am not such a fan of changing plugin names because it breaks existing usage (also in our sample experiments etc.), every user has to change their code, and the benefit is limited. I think we have some other poor names too and haven't been too bothered (e.g. hmax vs. add is not consistent and "sum" vs. "add" is confusing).
msg9257 (view) Author: gabi Date: 2020-05-04.10:30:11
I am clearly in favor of variant "operator_centric_stubborn_sets".
msg9256 (view) Author: jendrik Date: 2020-05-04.10:04:29
I agree that stubborn_sets_simple should become stubborn_sets_operator_centric. 

What do you think about changing the plugin names (the names used in commandline configs) from "stubborn_sets_operator_centric" etc. to "operator_centric_stubborn_sets"?

I'd do the renaming of other stubborn sets variants in a different issue or in the main branch.
msg9255 (view) Author: gabi Date: 2020-05-04.09:47:02
I don't think we *must* integrate all features from the paper into the standard code base but since in this case it doesn't add too much clutter to the code, I would be fine with the variant in Jendrik's pull request.

I wonder whether we should rename some of the existing parts (stubborn_sets_simple and stubborn_sets_ec). I think ec can stay as it is but simple should maybe become stubborn_sets_operator_centric (analogous to ..._atom_centric). What do you think?
msg9253 (view) Author: jendrik Date: 2020-05-02.14:12:33
I updated the code according to msg9249. Here is the pull request:
https://bitbucket.org/jendrikseipp/downward/pull-requests/84

Gabi, what's your opinion on the features? 

Silvan, could you review the code, please?
msg9250 (view) Author: malte Date: 2020-04-23.21:27:26
Thanks for the initiative, Jendrik! I'm always for making the names in the code match the terminology of the papers more closely. The docs for the plugin should cite the SoCS paper using our paper citation feature.

I don't necessarily think all variable orderings are needed, but I don't have strong feelings to the opposite either. I guess this is also a question of code size vs. code utility. What do the others think?

I can do a quick final code review once the others are happy to see this merged.
msg9249 (view) Author: jendrik Date: 2020-04-23.21:21:46
For context: we turned this issue into a SoCS 2020 paper.

I think it would be good to integrate the code now that things are still fresh on our minds. Therefore, I updated the pull request by merging from default, stripping the experimental commits and merged the code version from our paper. That version repeated the code for selecting facts once for preconditions and once for the goal facts, so I factored out the common code into a separate method.

Before I polish the code further, we should decide whether we want to keep all implemented features. I think it makes sense to keep the code for marking variables (the "siblings" feature) and I would also keep the option for turning this on and off. In my opinion the feature could be "on" by default. I don't think we necessarily need to keep all variable orderings, but they also don't need a huge amount of code, so I'm indifferent about whether to keep them. I would definitely keep "quick skip" and make it the default.

Finally, we should think about whether we want to revise any names. I propose the following changes:

StubbornSetsQueue -> StubbornSetsAtomCentric (for class, file and plugin names)
mark_variables -> use_sibling_shortcut
variable_ordering -> atom_selection_strategy
minimize_ss -> quick_skip

What are your thoughts on the features, defaults and names?
msg7102 (view) Author: jendrik Date: 2018-04-30.09:33:53
Regarding the stubborn sets: Yes, I'm also interested in working on this further.

Regarding the tracking tools: I use uBlock Origin, Privacy Badger and HTTPS 
Everywhere and can recommend all of them.
msg7101 (view) Author: silvan Date: 2018-04-29.12:42:55
> Does anyone have any recommendations for any of the tools they discuss, or
> others?
I only use Ad Block Plus and Ghostery. The latter one mainly because you and/or
others recommended it. According to your link, "Disconnect" is their preferred
alternative. 

> Anyone interested in that?
I'd be interested and would definitely also need to re-read some of the older
papers before contributing to any further discussion.
msg7100 (view) Author: malte Date: 2018-04-29.12:34:20
Back to the issue: the experimental results for all three changes look good,
which is nice and somewhat unexpected. (My preliminary tests for the
"opportunistic NES" idea didn't look encouraging.)

There is one further change that I would like to try, to do with a more
selective definition of interference than what we currently use. But that needs
further thought and some study of one of our older papers, and I'm not sure if
it can be combined with the use of weak stubborn sets.

More generally, I'm not sure how to best proceed. We now have a prototype
implementation, but I think this needs some further thought at the conceptual
level to come up with something clean and convincing. I think we should set
aside half a day or so to re-read the relevant papers, think about what we want
to do at the high-level, and then think how to realize the idea at the data
structure level. Anyone interested in that?
msg7099 (view) Author: malte Date: 2018-04-29.12:23:22
With Ghostery enabled but set to trust the site, I can render the experiment
page in ~4 minutes if I do nothing else at the same time, or ~13 minutes if I
simultaneously browse in other tabs. Without Ghostery, the page renders
immediately (a few seconds max). Ditto with Ghostery enabled, but loading from a
file (as Ghostery only affects http and https URLs, not file URLs). Pausing
Ghostery doesn't help. It has to be fully disabled in the browser extension menu. 

I've looked at alternatives to Ghostery here:
https://lifehacker.com/the-best-browser-extensions-that-protect-your-privacy-479408034.
Does anyone have any recommendations for any of the tools they discuss, or
others? (I'm more interested in the privacy/no-tracking features than ad
blocking, so the second category on the page.)

I guess filing a bug report with Ghostery could also help.
msg7098 (view) Author: malte Date: 2018-04-29.11:43:05
Yes, it seems to be related to Ghostery. White-listing ai.cs.unibas.ch seems to
help somewhat, but page load times are still not good. Disabling Ghostery
completely seems to address the issue, but is not really an option. I will
investigate a bit more.
msg7097 (view) Author: jendrik Date: 2018-04-29.11:42:54
The comparison report opens without any issues in both Chrome and Firefox for me.
msg7096 (view) Author: jendrik Date: 2018-04-29.11:40:46
I don't think anything changed regarding the reports in Downward Lab.
msg7095 (view) Author: silvan Date: 2018-04-29.11:31:50
I have the same issue. For me, though, the browser says that some script
(Ghostery addon?) may run forever. Loading the reports from a local file works
indeed, or alternatively using google chrome.
msg7094 (view) Author: malte Date: 2018-04-29.11:29:57
Did anything change with the reports over the last week, e.g. some kind of
change in downward-lab? I can no longer open them on Firefox over the net
(Ubuntu 16.04), they load forever and crank the CPU usage up to 100% for each tab.

Downloading them with wget and opening them from the file still works, so it
looks like it is some interaction between the partial loading of the page off
the web and the rendering.
msg7092 (view) Author: jendrik Date: 2018-04-29.09:48:15
Here are the experimental results for the different extensions:
http://ai.cs.unibas.ch/_tmp_files/seipp/issue781-v4-extensions-comparison.html
http://ai.cs.unibas.ch/_tmp_files/seipp/issue781-v4-extensions.html

I fixed measuring the pruning time. The bug only affected configurations using min_required_pruning_ratio > 0. In all 
previous experiments, we should therefore only take pruning times of configurations with min_required_pruning_ratio=0 into 
account.
msg7091 (view) Author: malte Date: 2018-04-28.21:00:51
Hi Jendik, I've tried out three small ideas (pushed to your repository; I hope
you don't mind -- feel free to strip later). The three ideas all branch off from
issue781-v3, so they create a number of new heads. I've used one tag for each,
but perhaps I should have used branches instead. The tags are:

issue781-v4-wss: use weak stubborn sets instead of strong stubborn sets
issue781-v4-varmark: add variable-based marking to fact-based marking
issue781-v4-opportunistic: select NES to opportunistically minimize stubborn set

The first of these should be a small performance improvement across the board.
The second could also offer a performance improvement, although it can also hurt
on occasion, and the effect likely varies more by domain.
The third will likely make blind search slower, but might help a bit with more
expensive heuristics.

Can you set up an experiment for these? I'd like to see the three new tags
compared to issue781-v3 using the blind heuristic and LM-Cut, so 8
configurations total. (I would only use the new stubborn set algorithm, with min
pruning disabled.) If useful, all of these can be polished further. If the third
one is not useful, there may be a way of implementing the same functionality
more efficiently.
msg7090 (view) Author: malte Date: 2018-04-28.12:48:27
(That's for the LM-Cut heuristic.)
msg7089 (view) Author: malte Date: 2018-04-28.12:45:36
Great, thanks! Thanks also for adding the time for pruning, but it looks wrong
for the configurations with a min pruning ratio. For example, for the new
algorithm, the total time for pruning is 2178.89 seconds without a min pruning
ratio, but 11282.52 seconds when using the min pruning ratio, which doesn't make
sense. (The second number should be lower.)
msg7088 (view) Author: jendrik Date: 2018-04-28.10:40:31
Good catch! I fixed this and reran the two configs. Here is the updated report:
http://ai.cs.unibas.ch/_tmp_files/seipp/issue781-v2-v3-combined.html
msg7087 (view) Author: malte Date: 2018-04-27.20:47:45
Thanks!

Looks like the min pruning code does not work for the new stubborn set method
yet, and indeed there is a code comment in the parse code that suggests this.
Can we change this and rerun the two affected configurations?

Given how generic the min pruning feature is, I think it should really work out
of the box for a new pruning method without the derived class having to worry
about it (other than adding the options). That it doesn't perhaps means that the
division betwen the base class responsibilities and the derived class
responsibilities are not designed well enough. It would probably make sense to
apply the template method pattern to make sure the base class code we always
want is always executed. (I already thought the same thing a few days ago
because of the code comment that says that derived code can assert that the
current state is not a goal state -- this should really not be the derived
classes' job.)
msg7086 (view) Author: jendrik Date: 2018-04-27.10:37:51
Here are the results for the experiment you requested:
http://ai.cs.unibas.ch/_tmp_files/seipp/issue781-v2.html
msg7083 (view) Author: malte Date: 2018-04-26.10:59:46
I had a brief look at the code, and if we were about to merge this I'd have a
few comments on minor optimizations etc., but I suggest we wait with that until
it's clearer what we want to do. There are still quite a few things we might
want to try out.
msg7082 (view) Author: malte Date: 2018-04-26.10:54:25
Looks good!

I think there is more optimization potential, but before we go down this route,
I think it would be useful to add a timer to the code that measures and reports
the time spent on stubborn set computation. It would be useful to have this for
the other methods, too, at least within the branch. I suppose it could be added
to the base class? (Whether we want to merge that is a separate question, but I
think it would be useful for this issue.)

The new code is supposed to be semantically equivalent to the simple stubborn
sets, and it looks like the experiments bear this out in terms of expansions
etc., although I didn't check all the numbers.

Can you also run experiments with LM-Cut, and perhaps experiments with the
min-pruning bound enabled? So 12 configurations in total: three stubborn set
methods, two heuristics, with and without the pruning bound. Going forward, I
think we don't need all these comparisons all the time (e.g. I think another
comparison to the expansion core variant only becomes interesting once we make
changes that affect what is pruned rather than how quickly the computation is
done), but I think it would be good to have all the data once, including how
much time the stubborn set computations take.
msg7081 (view) Author: jendrik Date: 2018-04-26.07:24:34
Here are the results: http://ai.cs.unibas.ch/_tmp_files/seipp/issue781-v1-blind.html

The new version solves more tasks than both older versions in many domains.
msg7080 (view) Author: jendrik Date: 2018-04-25.23:43:55
I implemented the pseudo-code: https://bitbucket.org/jendrikseipp/downward/pull-requests/84

An experiment comparing the three stubborn sets variants (using the blind heuristic) is running.
msg7075 (view) Author: malte Date: 2018-04-24.22:34:41
I thought about this a bit and will send you some pseudo-code by email. I
suggest to first implement this as a new pruning method rather than modifying
the existing code (but perhaps starting by copying some code). If results are
good, we can think about whether we want all versions later. But it may take a
few iterations to get there.
msg7073 (view) Author: jendrik Date: 2018-04-24.17:44:25
Quoting Malte in issue778 (msg7072):

[...] The main
problem with the disabling/interference data structures is that they can be so
large, O(N^2) in the number of operators. Only computing the information on
demand can help a bit with that, but only on problems where a large portion of
operators never becomes applicable. I think it would be better to look into
avoiding the O(N^2) data structure altogether. In other words, I think this is a
case where we should be thinking about the algorithm rather than polishing the
implementation details.

I think that by thinking about this a bit more deeply, we might be able to come
up with a better idea at the algorithm level, perhaps ending up with a much more
substantial improvement. I feel somewhat similarly about issue773: I'm not sure
if that change would have been necessary if we had addressed things more
directly and made the stubborn set computations more efficient.

If someone is interested in looking into this with me, I suggest we set aside
half an hour to look at what is actually computed and whether the same
information could not be computed more intelligently. I am 80% confident that we
can make the stubborn sets substantially faster and also get rid of the
worst-case O(|Actions|^2) data structures. [...]
History
Date User Action Args
2020-05-12 16:29:55jendriksetmessages: + msg9326
2020-05-12 16:02:22maltesetmessages: + msg9325
2020-05-12 15:57:43silvansetmessages: + msg9324
2020-05-12 15:55:00jendriksetmessages: + msg9323
2020-05-12 15:53:06jendriksetmessages: + msg9321
2020-05-12 15:36:27maltesetmessages: + msg9320
2020-05-12 15:27:42silvansetmessages: + msg9318
2020-05-12 14:51:53jendriksetstatus: reviewing -> resolved
messages: + msg9311
2020-05-12 14:46:06maltesetmessages: + msg9309
2020-05-12 14:45:47jendriksetmessages: + msg9308
2020-05-12 14:10:49maltesetmessages: + msg9307
2020-05-12 13:30:42jendriksetmessages: + msg9306
2020-05-12 12:52:15maltesetmessages: + msg9303
2020-05-11 19:32:35maltesetmessages: + msg9298
2020-05-08 16:21:04jendriksetmessages: + msg9291
2020-05-06 00:15:13jendriksetmessages: + msg9270
2020-05-05 18:03:14silvansetmessages: + msg9266
2020-05-04 13:07:54jendriksetmessages: + msg9265
2020-05-04 12:30:00maltesetmessages: + msg9264
2020-05-04 12:25:17silvansetmessages: + msg9263
2020-05-04 11:52:57jendriksetmessages: + msg9262
2020-05-04 11:07:50maltesetmessages: + msg9261
2020-05-04 11:01:40jendriksetmessages: + msg9260
2020-05-04 10:53:07maltesetmessages: + msg9259
2020-05-04 10:52:42maltesetmessages: + msg9258
2020-05-04 10:30:11gabisetmessages: + msg9257
2020-05-04 10:04:29jendriksetmessages: + msg9256
2020-05-04 09:47:03gabisetmessages: + msg9255
2020-05-02 14:12:33jendriksetstatus: in-progress -> reviewing
messages: + msg9253
2020-04-23 21:27:26maltesetmessages: + msg9250
2020-04-23 21:21:46jendriksetnosy: + gabi
messages: + msg9249
2018-04-30 09:33:53jendriksetmessages: + msg7102
2018-04-29 12:42:55silvansetmessages: + msg7101
2018-04-29 12:34:20maltesetmessages: + msg7100
2018-04-29 12:23:22maltesetmessages: + msg7099
2018-04-29 11:43:05maltesetmessages: + msg7098
2018-04-29 11:42:54jendriksetmessages: + msg7097
2018-04-29 11:40:46jendriksetmessages: + msg7096
2018-04-29 11:31:50silvansetmessages: + msg7095
2018-04-29 11:29:57maltesetmessages: + msg7094
2018-04-29 09:48:16jendriksetmessages: + msg7092
2018-04-28 21:00:51maltesetmessages: + msg7091
2018-04-28 12:48:27maltesetmessages: + msg7090
2018-04-28 12:45:36maltesetmessages: + msg7089
2018-04-28 10:40:31jendriksetmessages: + msg7088
2018-04-27 20:47:45maltesetmessages: + msg7087
2018-04-27 10:37:51jendriksetmessages: + msg7086
2018-04-26 10:59:46maltesetmessages: + msg7083
2018-04-26 10:54:25maltesetmessages: + msg7082
2018-04-26 07:24:34jendriksetmessages: + msg7081
2018-04-25 23:43:55jendriksetstatus: chatting -> in-progress
assignedto: jendrik
messages: + msg7080
2018-04-25 11:00:50silvansetnosy: + silvan
2018-04-24 22:34:41maltesetstatus: unread -> chatting
messages: + msg7075
2018-04-24 17:44:25jendrikcreate