Issue927

Title M&S refactoring part 2: get rid of LabelEquivalenceRelation and use label mappings local to TransitionSystem instead
Priority feature Status reviewing
Superseder Nosy List atorralba, jendrik, malte, salome, silvan
Assigned To silvan Keywords
Optional summary
Part of issue567, and to some extent follow-up of issue851.

Pull request: https://github.com/aibasel/downward/pull/11

Possible follow-up issue:
- rework EquivalenceRelation which can be the bottleneck of the merge-and-shrink computation in certain tasks

Created on 2019-09-18.15:34:48 by silvan, last changed by malte.

Summary
Part of issue567, and to some extent follow-up of issue851.

Pull request: https://github.com/aibasel/downward/pull/11

Possible follow-up issue:
- rework EquivalenceRelation which can be the bottleneck of the merge-and-shrink computation in certain tasks
Messages
msg10244 (view) Author: malte Date: 2021-04-09.12:04:55
> Apparently, the overhead of that method, which creates a HashState and calls
> feed, is too large when hashing plain data types like int. I think we should
> check if the planner uses HashMap/HashSet with int or similar data types at
> other places. Also, we should document that using the stl data structures
> directly is preferred in such cases.

I think this is not the first time we discover this. These hash functions are designed with container objects in mind and with an emphasis on distributing as uniformly as possible. They are certainly not cryptographic hash functions, but in the spectrum of "cheap computation vs. uniform distribution", they emphasize uniform distribution more than the typical library implementations.

What makes this even more complicated is that trivial hash functions like
    [](int i) { return i; }
that can be very problematic for distribution (for mod-based hash tables) can be excellent for cache locality in typical scenarios where we see many keys clustered close to each other.

We also have the dedicated IntHashSet type that can be useful in certain contexts, but I think it is primarily designed to be memory-efficient for very large hash tables, not to be fast for small ones. (It is used only by the state registry.)

The question is how best to document things like these. If we just mention this in hash.h, I am not sure we will see it when we need to.
msg10243 (view) Author: silvan Date: 2021-04-09.10:55:21
It turns out that the culprit of the performance loss in v12e was not the simplification of EquivalenceRelation, but the switch from std::unordered_map to utils::HashMap. I got skeptical when "std::uint64_t get_hash64(const T &value)" showed up in a profile. Indeed, reverting this change in v12e fixes the performance loss:
https://ai.dmi.unibas.ch/_tmp_files/sieverss/issue927-v12e-issue927-v12d-issue927-v12e-compare.html
and also mildly improves over 12c, although the scale of changes could be in the range of noise:
https://ai.dmi.unibas.ch/_tmp_files/sieverss/issue927-v12e-issue927-v12c-issue927-v12e-compare.html

Apparently, the overhead of that method, which creates a HashState and calls feed, is too large when hashing plain data types like int. I think we should check if the planner uses HashMap/HashSet with int or similar data types at other places. Also, we should document that using the stl data structures directly is preferred in such cases.

Furthermore, when looking at a profile of the first task of parking-opt14-strips, we noticed that EquivalenceRelation::refine is responsible for 87% of the run-time of the first three merge-and-shrink iterations. It might be worth looking into completely reworking EquivalenceRelation in another issue. I added an item to the summary.
msg10242 (view) Author: silvan Date: 2021-04-06.19:23:37
v12: only refactoring in comparison to v9-fix (reported in the previous comment):
https://ai.dmi.unibas.ch/_tmp_files/sieverss/issue927-v12-issue927-base-v3-issue927-v12-compare.html
https://ai.dmi.unibas.ch/_tmp_files/sieverss/issue927-v12-issue927-v9-fix-issue927-v12-compare.html
By now, I assume that changes in runtime scores of up to 2 can just be attributed to noise in the sense of code changes that should not impact the performance, but might affect memory layout or whatever else.

v12b: removes the class GlobalLabel (which only contained an int standing for the cost of the label) and GlobalLabels stores vector<int> label_costs directly instead of unique pointer to GlobalLabel:
https://ai.dmi.unibas.ch/_tmp_files/sieverss/issue927-v12-issue927-v12-issue927-v12b-compare.html
If anything, I expected a positive impact because of one less layer of indirection and not using unique pointers. But the effect is "negative", in the same amount as v12 compared to v9-fix was positive even though this was pure refactoring.

v12c: add iterators for GlobalLabels which allows iterating over "valid" labels, which, however, is only used during the construction of the atomic FTS:
https://ai.dmi.unibas.ch/_tmp_files/sieverss/issue927-v12-issue927-v12b-issue927-v12c-compare.html#summary
Basically no effect, was expected.

v12d: Florian and I greatly simplified the class EquivelanceRelation (in algorithms). It only works with ints now, only can be constructed as the universal equivalence relation, and no longer supports "unspecified elements". This allows using it with only valid labels, which should avoid having reduced labels as "unspecified elements" in the equivalence relation. 
https://ai.dmi.unibas.ch/_tmp_files/sieverss/issue927-v12-issue927-v12c-issue927-v12d-compare.html#summary
The results are bad and I don't know (yet) why. We moved previously templated methods to the cc file, maybe this has an effect. We'll need to investigate this further.
msg10241 (view) Author: silvan Date: 2021-04-05.18:49:10
I noticed that the loop that erases old labels from transition systems when applying label reductions unnecessarily considered all labels of a label group instead of stopping after erasing the label. This significantly improves the performance:
https://ai.dmi.unibas.ch/_tmp_files/sieverss/issue927-v9-fix-issue927-base-v3-issue927-v9-fix-compare.html
https://ai.dmi.unibas.ch/_tmp_files/sieverss/issue927-v9-fix-issue927-v9-issue927-v9-fix-compare.html
msg10238 (view) Author: silvan Date: 2021-03-30.10:24:39
After merging the latest changes from main (v9), the changes in this issue compare less favourably to main (base-v3) than previously, although overall still positively: 
https://ai.dmi.unibas.ch/_tmp_files/sieverss/issue927-v9-issue927-base-v3-issue927-v9-compare.html
(successful constructions for more tasks, less memory consumption, often faster, sometimes significantly slower, however).

Compared to the previous version (v8), the merge-and-shrink construction slows down a bit (ms_construction_time, score_ms_construction_time): 
https://ai.dmi.unibas.ch/_tmp_files/sieverss/issue927-v9-vs-v8-issue927-v8-issue927-v9-compare.html
I don't really understand why, as all changes between v8 and v9 are due to changes in main. 

I therefore also compared the two main revisions against each other: 
https://ai.dmi.unibas.ch/_tmp_files/sieverss/issue927-base-v2-vs-v3-issue927-base-v2-issue927-base-v3-compare.html
The results of this comparison are very surprising to me because I don't know what could cause this difference. The diff between the two versions only makes cosmetic changes in M&S. The major things we changed in Fast Downward, like the state representation, shouldn't affect the construction of M&S heuristics.

Since the latest version still uses much less memory and only becomes significantly slower in a few domains, I think we can accept that strange performance change. (We cannot do something about the weird change due to main anyway.) Instead, when discussing this issue with Florian and Salomé, we thought it would be a good idea to look at the problematic domains, in particular, agricola, where construction time can be doubled compared to main. 

I did that and found that applying label reductions to transition systems is the culprit: where we previously stored , for each label, the id of its "local label group" and an iterator into that group (a list), we now only store the group id and need to search for the label in its group (now a vector) for removing it. I attempted to improve the situation using set (v10) or HashSet (v11). While this helped in the previously problematic case agricola (to some extent, only), it significantly slowed down the computation in other domains: 
https://ai.dmi.unibas.ch/_tmp_files/sieverss/issue927-v10-v11-issue927-base-v3-issue927-v10-compare.html
https://ai.dmi.unibas.ch/_tmp_files/sieverss/issue927-v10-v11-issue927-base-v3-issue927-v11-compare.html
So I think using vectors (v9) is the right direction and we should discuss if we can live with slower construction times in a few domains. Alternatively, we could (again) store an iterator into a label group for each label, but I didn't like that previous part of the code too much, so I would rather try to avoid it.

Florian, Salomé and I also discussed if reverting the change that all "local label ids" (i.e., the label groups) are stored contiguously (added in v3). This might trade some memory usage for runtime, although back then we didn't observe too much penalty for enforcing the contiguous storing of label groups.
msg9714 (view) Author: malte Date: 2020-09-17.09:23:09
Sounds good to me. The cost partitioning thing is not a blocker or an overriding concern, just a thing I suggest we keep in mind. It is a goal to open these things up a bit more given how much the BDE project focuses on this aspect.
msg9713 (view) Author: silvan Date: 2020-09-16.18:38:01
The new implementation would make implementing optimal cost partitioning easier. For SCP, it would probably not help but could make it more complicated as we need to compute the remaining costs of the actual labels, not some labels "local" to a transition system.

Nevertheless, I'm not sure if this consideration (regarding cost partitioning) should be a central concern in this issue because I think that the change in this issue makes the code related to transition systems easier in general. But of course, getting someone else to review this first would be good. I hoped Alvaro could have a look into this since he also would like to see this change, I think, but he might be too busy in Denmark now. I'll ask Florian/Thomas if they can find some time.
msg9712 (view) Author: malte Date: 2020-08-31.17:53:51
Hi Silvan, to move this forward, it would be good if someone else could look at the  code first.

This is currently an implementation detail of merge-and-shrink, but it may become more visible when we add more cost partitioning or operator-counting constraint features. Does the new representation help with that, hinder that, or is it neutral? Would it make sense to get someone else who was included in cost partitioning and merge-and-shrink to review?
msg9343 (view) Author: silvan Date: 2020-06-09.09:30:59
I evaluated the current implementation (with all improvements together) after merging from default:
https://ai.dmi.unibas.ch/_tmp_files/sieverss/issue927-v8-issue927-base-v2-issue927-v8-compare.html
msg9332 (view) Author: silvan Date: 2020-05-24.16:00:21
I looked into this again today but couldn't actually find anything else I would like to do in this issue. So now, with an unnecessary delay of 8 months :-), I'd like to ask for a review. Or, as a first step, for ideas of how to simplify reviewing in this case. The pull request list the following changes:

+0 -59
M src/search/algorithms/equivalence_relation.cc
+64 -1
M src/search/algorithms/equivalence_relation.h
+3 -7
M src/search/merge_and_shrink/distances.cc
+8 -9
M src/search/merge_and_shrink/factored_transition_system.cc
+5 -5
M src/search/merge_and_shrink/factored_transition_system.h
+69 -47
M src/search/merge_and_shrink/fts_factory.cc
+72 -0
A src/search/merge_and_shrink/global_labels.cc
+47 -0
A src/search/merge_and_shrink/global_labels.h
+0 -134
D src/search/merge_and_shrink/label_equivalence_relation.cc
+0 -122
D src/search/merge_and_shrink/label_equivalence_relation.h
+3 -4
M src/search/merge_and_shrink/label_reduction.cc
+0 -62
D src/search/merge_and_shrink/labels.cc
+0 -46
D src/search/merge_and_shrink/labels.h
+4 -1
M src/search/merge_and_shrink/merge_and_shrink_algorithm.cc
+1 -2
M src/search/merge_and_shrink/merge_scoring_function_dfp.cc
+1 -3
M src/search/merge_and_shrink/shrink_bisimulation.cc
+270 -129
M src/search/merge_and_shrink/transition_system.cc
+50 -52
M src/search/merge_and_shrink/transition_system.h 

The files labels.* and global_labels.* should be reviewed next to each other (the new class is a bit simpler) and some of the logic of what used to be in label_equivalence_relation is now part of transition_system (and fts_factory, which creates transition systems), however also modified as described below. The change to equivalence_relation is due to making refine being a templated method (therefore in the header). The rest are only small changes required to adapt to the changed interface of transition systems.
msg8967 (view) Author: silvan Date: 2019-09-26.14:32:44
Added link to meta issue567.

I just found that issue851 can be seen as a predecessor of this issue. To quote its summary: 
Possible further improvements: do not store information for all labels in each
transition system, but only for the relevant ones. This requires either a
concept of local label numbers and a mapping between global and local labels,
or using hash maps instead of vectors whenever storing information per label,
leaving out irrelevant labels.
msg8964 (view) Author: silvan Date: 2019-09-21.17:35:43
I'll you know, yes.

In step 3, after reducing labels and shrinking, we always make sure that there are no gaps in vector<vector<Transition>>, i.e., we have a contiguous "numbering" of "local labels". 

The effect can be both negative or positive, especially in large instances, sometimes reducing memory and speed, sometimes trading the former for the latter: https://ai.dmi.unibas.ch/_tmp_files/sieverss/issue927-v3-issue927-v2-issue927-v3-compare.html
msg8963 (view) Author: malte Date: 2019-09-20.21:36:42
Hi Silvan, I'm curious, but lack the time, so I'll stay quiet for the time being.  But sounds exciting! Let us know when something needs to be done in terms of reviewing etc.
msg8962 (view) Author: silvan Date: 2019-09-19.11:41:38
In a small second step, I did some more minor refactoring to avoid having a global label mapping from global to local labels, but instead having it individually for each transition system. No large changes: https://ai.dmi.unibas.ch/_tmp_files/sieverss/issue927-v2-issue927-v1-issue927-v2-compare.html
msg8961 (view) Author: silvan Date: 2019-09-18.22:04:06
The results of the first step are in. The only change is to replace LabelEquivalenceRelation (which uses a hash map to map global labels to local label group ids and a list for said groups) by a combination of global to local and local to global label mappings in each transition system, only using vectors.

https://ai.dmi.unibas.ch/_tmp_files/sieverss/issue927-v1-issue927-base-issue927-v1-compare.html

There is a clear positive performance trend and we can finish the computation of atomic FTS and the whole heuristic in more cases than before. (The attributes ms_out_of_memory and ms_out_of_time were not correctly parsed as of this writing; the report will be updated tomorrow.)

Here's a pull request if someone is already curious: https://bitbucket.org/SilvanS/fd-dev/pull-requests/54/issue927/diff
msg8960 (view) Author: silvan Date: 2019-09-18.15:34:48
In a recent discussion with Álvaro, we came to the conclusion that it would be worth trying to change the representation of transition systems. In particular, getting rid of LabelEquivalenceRelation and its underlying data structure of using a hash map from labels to local label group ids, and using lists for label groups. Using the notion of "local labels" and vector<int> to store both a mapping from "global" to local labels and vice versa could not only help simplify the code greatly, but also potentially improve performance.

In a first step, we will replace LabelEquivalenceRelation by appropriate mappings (vector<int>) as just described. Later, we could also try to always keep local and/or global label numbers contiguous and thereby get rid of the need for testing if a global label number is valid or not (because it has been reduced). In principle, we could also attempt the same thing for transition systems in factored transition systems. Finally, what is up for debate is whether we want to not store transitions of irrelevant labels, like it was the case years ago before the introduction of generalized label reduction.
History
Date User Action Args
2021-04-09 12:04:55maltesetmessages: + msg10244
2021-04-09 10:55:22silvansetmessages: + msg10243
summary: Part of issue567, and to some extent follow-up of issue851. Pull request: https://github.com/aibasel/downward/pull/11 -> Part of issue567, and to some extent follow-up of issue851. Pull request: https://github.com/aibasel/downward/pull/11 Possible follow-up issue: - rework EquivalenceRelation which can be the bottleneck of the merge-and-shrink computation in certain tasks
2021-04-06 19:23:38silvansetmessages: + msg10242
2021-04-05 18:49:10silvansetmessages: + msg10241
2021-03-30 10:24:41silvansetnosy: + salome
messages: + msg10238
2020-09-17 09:23:10maltesetmessages: + msg9714
2020-09-16 18:38:02silvansetmessages: + msg9713
2020-08-31 17:53:52maltesetmessages: + msg9712
2020-07-24 14:23:54silvansetsummary: Part of issue567, and to some extent follow-up of issue851. -> Part of issue567, and to some extent follow-up of issue851. Pull request: https://github.com/aibasel/downward/pull/11
2020-06-09 09:30:59silvansetstatus: in-progress -> reviewing
messages: + msg9343
2020-05-24 16:00:22silvansetmessages: + msg9332
2019-09-26 14:32:44silvansettitle: M&S: get rid of LabelEquivalenceRelation and use label mappings local to TransitionSystem instead -> M&S refactoring part 2: get rid of LabelEquivalenceRelation and use label mappings local to TransitionSystem instead
messages: + msg8967
summary: Part of issue567, and to some extent follow-up of issue851.
2019-09-21 17:35:43silvansetmessages: + msg8964
2019-09-20 21:36:42maltesetmessages: + msg8963
2019-09-19 11:41:38silvansetmessages: + msg8962
2019-09-19 09:40:57jendriksetnosy: + jendrik
2019-09-18 22:04:06silvansetmessages: + msg8961
2019-09-18 15:34:48silvancreate