Issue780

Title: merge code from saturated cost partitioning papers
Priority: wish
Status: reviewing
Superseder:
Nosy List: jendrik, malte, salome, silvan, thomas
Assigned To: jendrik
Keywords:
Created on 2018-04-22.17:18:37 by jendrik, last changed by jendrik.

Summary
Pull request: https://github.com/aibasel/downward/pull/6

TODO:

* Discuss code.
Files
callgrind.out.7907 (application/octet-stream), uploaded by jendrik on 2018-07-28.14:28:16
Messages
msg9589 (view) Author: jendrik Date: 2020-07-10.20:39:49
Update TODOs and add link to new pull request.
msg9254 (view) Author: jendrik Date: 2020-05-02.15:39:23
In order to make the reviewing simpler, I removed the following functionality:

- Projections
- Greedy orders
- Optimizing orders with hill climbing
- Unsolvability heuristic (allowing to store unsolvable states once per abstraction instead of once per order)

That means the code now supports computing diverse, random orders over Cartesian abstractions. The change reduces the number of files to review from 35 to 24 and the total line count from 3115 to 1703.

Here is the link to the pull request again: https://bitbucket.org/jendrikseipp/downward/pull-requests/88
msg9215 (view) Author: salome Date: 2020-03-19.11:02:05
Since I did not really have a chance to take a good look at the code yet, I think it does not make sense for me to be part of this discussion at the moment, so feel free to meet without me.
msg9214 (view) Author: jendrik Date: 2020-03-19.10:01:32
Sounds good. Should we schedule a video meeting for this? Maybe we can hold this discussion right after the Fast Downward meeting today.
msg9213 (view) Author: silvan Date: 2020-03-19.09:55:11
Thomas and I already discussed 3 weeks back that we would want to have a more general discussion on the design of the code. Both of us have worked with the code base and found some things difficult to understand/use. So far, we haven't found the time to have this discussion. But I would say that I'm quite familiar with the code now, except for the order generation, which I didn't need to touch. So I'm not sure splitting up the review is the right next step, because I think that before looking at the code quality, we need to discuss the overall functionality/design.
msg9212 (view) Author: jendrik Date: 2020-03-18.13:45:16
Have you already discussed how to split up the code review? If not, what do you think about the following split?

Salomé:
  cost_partitioning_heuristic.cc
  cost_partitioning_heuristic_collection_generator.cc
  diversifier.cc
  max_cost_partitioning_heuristic.cc
  saturated_cost_partitioning_heuristic.cc
  unsolvability_heuristic.cc
  utils.cc

Silvan:
  abstraction.cc
  abstraction_generator.cc
  cartesian_abstraction_generator.cc
  explicit_abstraction.cc
  projection.cc
  projection_generator.cc

Thomas:
  greedy_order_utils.cc
  order_generator.cc
  order_generator_greedy.cc
  order_generator_random.cc
  order_optimizer.cc
msg9207 (view) Author: jendrik Date: 2020-02-28.19:07:16
Thanks for offering to review the code, guys! Salomé also agreed to help with the review.
msg9206 (view) Author: malte Date: 2020-02-28.15:08:45
If I recall our recent discussions on this correctly, Silvan mentioned essentially the same points as me, so perhaps it would be good if, at this point, I would not be part of the critical path.
msg9205 (view) Author: thomas Date: 2020-02-28.14:59:21
I can have a look at the code, but Malte didn't seem too happy with just me reviewing this. Silvan offered to help, but I don't know if that will be sufficient. As this will be quite some work, I think we should first agree on how it should be reviewed.
msg9148 (view) Author: jendrik Date: 2020-01-13.20:21:05
I merged the default branch into the issue branch. Thomas, could you have a look at the code, please?
msg9041 (view) Author: jendrik Date: 2019-12-01.19:39:32
We talked about this offline a while ago, but I didn't update the status: we agreed to defer the improvements from msg8428 and would like to have another code review before we merge. Before the review, I'll merge the latest changes from the default branch into the issue branch. I'm updating the summary accordingly.
msg8428 (view) Author: malte Date: 2019-01-04.18:05:34
> I'd be interested in knowing whether you see any other possible performance
> improvements for the Projection::for_each_transition() method. You can find the
> code for this method at
> https://bitbucket.org/jendrikseipp/downward/pull-requests/88/issue780/diff#Lsrc/search/cost_saturation/projection.hT106

I haven't looked at this deeply, only at this specific method. I haven't yet
looked at the details that are not immediately visible from the method, i.e. how
"get_slice" and "increment_to_next_state" are implemented.

A few things that come to mind:

1. From the documentation, it is not clear to me which operators this loops over.
Depending on the use cases, four versions of this may make sense:

A. Consider all transitions, including from irrelevant operators.
B. Consider all transitions of relevant operators, including operators that are
only relevant for preconditions (and hence only cause self-loops).
C. Consider all transitions of operators that affect a variable in the
projection, including self-looping transitions.
D. Only consider state-changing transitions.

The difference between C and D is perhaps slight, but we should be precise in
how we name/document this. I assume that A can be implemented much more cheaply
and B somewhat more cheaply than C/D. The documentation says "all transitions
including irrelevant transitions", but it's not so clear what this means.

2. This won't help with performance, but this looks like a model example for
using a do-while loop rather than a while loop.

3. I would have to look at the details, but keeping track of the values of state
variables inside the loop seems wrong from a performance perspective.
"increment_to_next_state" should be implementable as something that just takes a
(hashed) state ID, also replacing lines 123-125 of the linked-to code. You could
then avoid the two kinds of "returned values" (bool plus modified by-reference
parameter) by just returning a state and using a special value for "no more
state", but that's just stylistic. Basically, apart from the callback call, the
inner loop should be amortized time O(1) rather than O(no. abstract facts). I'm
not sure it will make a giant difference because we presumably don't usually
have that many abstract facts, but I think it's still the right thing to do.
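Points 2 and 3 might be sketched roughly as follows. This is a standalone illustration with made-up names (FreeVariable, for_each_matching_state), not the pull-request code: the states matching an operator's precondition are enumerated by counting in mixed radix over the unmentioned variables, so each increment only touches the digits that carry, making the sweep amortized O(1) per state, and the do-while shape reflects that the all-zeros counter is always a valid state.

```cpp
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical names, not the actual Fast Downward code. Each free
// (unmentioned) variable contributes a digit of a mixed-radix counter;
// hash_multiplier is its coefficient in the perfect hash.
struct FreeVariable {
    int domain_size;
    int hash_multiplier;
};

// Calls callback(hash) once for every abstract state whose hash extends
// base_hash (the hash contribution of the fixed precondition facts).
void for_each_matching_state(
        int base_hash,
        const std::vector<FreeVariable> &free_vars,
        const std::function<void(int)> &callback) {
    std::vector<int> counter(free_vars.size(), 0);
    int hash = base_hash;
    bool done;
    // do-while: the all-zeros counter is always a valid state, so the
    // body runs at least once.
    do {
        callback(hash);
        done = true;
        for (std::size_t i = 0; i < free_vars.size(); ++i) {
            if (counter[i] + 1 < free_vars[i].domain_size) {
                ++counter[i];
                hash += free_vars[i].hash_multiplier;
                done = false;
                break;
            }
            // Carry: reset this digit and continue with the next one.
            hash -= counter[i] * free_vars[i].hash_multiplier;
            counter[i] = 0;
        }
    } while (!done);
}
```

Like a binary counter, the carry propagation costs amortized O(1) per enumerated state, instead of rebuilding the hash from all abstract facts each time.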

4. It might help to make the functions called within the loop inlinable.

5. I expect that in many cases, we get lots of operators that behave
identically, or perhaps identically apart from the cost. For example, it's
common to have lots of operators of the form "X = a, Y = b => Y := c", i.e. a
state change of Y conditioned on X. If we project away X, many of these
operators might end up behaving identically. In cases where the exact operator
doesn't matter, this may be worth detecting and then only working with one
representative. So again multiple cases here; one where we really want all
transitions, one where we want a weighted but unlabeled transition system (and
perhaps then only the minimum cost of each transition), one where we want an
unweighted and unlabeled transition system, or perhaps one where we do care
about operator identity (for example for cost partitioning), but still would
like to process operators together that are equivalent. (So the main work would
work with one representative, but there would be a way to find out which
operators are represented by it.)
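Point 5 could look roughly like this sketch (our own simplified names and data, not the actual code): operators are keyed by their projected preconditions and effects, each group keeps the minimum cost, and the group remembers which concrete operators its representative stands for.

```cpp
#include <algorithm>
#include <map>
#include <utility>
#include <vector>

// Hypothetical, simplified stand-ins for the projected operator data.
using Facts = std::vector<std::pair<int, int>>;  // (variable, value) pairs

struct ProjectedOperator {
    Facts preconditions;
    Facts effects;
    int cost;
    int concrete_operator_id;
};

struct OperatorGroup {
    int min_cost;
    std::vector<int> concrete_operator_ids;
};

using Key = std::pair<Facts, Facts>;

// Group operators that induce identical abstract transitions. The main
// computation can then work with one representative per group, while the
// group still records which concrete operators it represents.
std::map<Key, OperatorGroup> group_equivalent_operators(
        const std::vector<ProjectedOperator> &ops) {
    std::map<Key, OperatorGroup> groups;
    for (const ProjectedOperator &op : ops) {
        Key key(op.preconditions, op.effects);
        auto it = groups.find(key);
        if (it == groups.end()) {
            groups.emplace(key, OperatorGroup{op.cost, {op.concrete_operator_id}});
        } else {
            it->second.min_cost = std::min(it->second.min_cost, op.cost);
            it->second.concrete_operator_ids.push_back(op.concrete_operator_id);
        }
    }
    return groups;
}
```

This covers the unlabeled-with-minimum-cost case; the cases that care about operator identity would read the stored concrete_operator_ids instead of only the representative.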

So in summary, apart from #3, the main idea is not actually to consider all
transitions if we don't actually care about them, similarly to the way
equivalent labels are used in merge-and-shrink, or the way we partition
operators into groups in post-hoc optimization. This obviously depends a lot on
the use cases for this, so I think it requires a look at a slightly larger picture.

I think it's an interesting problem to look at and I'd be happy to look at this
together, but it's probably of a scope where it's best to be able to talk in person.
msg8427 (view) Author: jendrik Date: 2019-01-03.14:50:50
I noticed that storing the necessary information for the
Projection::for_each_transition() method leads to using a lot of memory for
tasks with many operators (such as tasks from the tetris domain). To reduce the
memory footprint I used the new ArrayPool class:

v7: version after handling Silvan's comments and merging with the default branch
v8: intermediate version, which we can ignore
v9: using an ArrayPool to store which variables are unaffected by abstract operators

Here is the diff between v7 and v9:
https://bitbucket.org/jendrikseipp/downward/branches/compare/issue780-v9..issue780-v7#diff

The change leads to using less memory

https://ai.dmi.unibas.ch/_tmp_files/seipp/issue780-v9-issue780-v7-issue780-v9-search_start_memory-scp-sys2-cart-1K-orders.png

and also reduces the time needed for computing the cost partitionings:

https://ai.dmi.unibas.ch/_tmp_files/seipp/issue780-v9-issue780-v7-issue780-v9-time_for_computing_cost_partitionings-scp-sys2-cart-1K-orders.png

Coverage numbers vary by at most +- 2 per domain, which I wouldn't worry about.
Here is the report:

https://ai.dmi.unibas.ch/_tmp_files/seipp/issue780-v9-issue780-v7-issue780-v9-compare.html

I'd be interested in knowing whether you see any other possible performance
improvements for the Projection::for_each_transition() method. You can find the
code for this method at
https://bitbucket.org/jendrikseipp/downward/pull-requests/88/issue780/diff#Lsrc/search/cost_saturation/projection.hT106
.
msg8356 (view) Author: jendrik Date: 2018-12-14.17:49:00
I did the improvements we discussed offline. Regarding Silvan's questions:

> I looked at the overall structure of the to-be-added subdir and left some
> comments at bitbucket. I think this needs a bit further discussion as of what
> this subdir (aka. plugin?) really wants to support now and in future, to decide
> on some of the names that are used here.

We discussed the names offline and made some file and variable names clearer.
Currently, the directory is called "cost_saturation" and we decided to keep the
name for now since the directory only contains the code for saturated cost
partitioning. If we add optimal cost partitioning or zero-one cost partitioning
later, we should rename the directory.

> As discussed offline, we also shouldn't forget that the SCP implementation of
> CEGAR could switch to use this new code base.

Currently, the code in "cost_saturation" uses the code in "cegar", but not the
other way around. I think this should stay the way it is. We need two different
data structures for representing Cartesian abstractions efficiently: one for
refining abstractions (where we store incoming and outgoing transitions for each
state) and one for computing goal distances and saturated cost functions (where
we only store "backward" transitions).
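To illustrate why backward transitions suffice for that second use (a minimal self-contained sketch with our own names, not the code under review): goal distances under a given cost function are computed by Dijkstra's algorithm from the goal states over the reversed transition graph, so only (operator, predecessor) pairs per state are needed.

```cpp
#include <functional>
#include <limits>
#include <queue>
#include <utility>
#include <vector>

// backward[s] lists (operator, predecessor) pairs: applying `operator`
// in `predecessor` leads to state s. Names are hypothetical.
using BackwardTransition = std::pair<int, int>;

const int INF = std::numeric_limits<int>::max();

// Dijkstra from all goal states over the reversed transition graph.
// Assumes finite, non-negative operator costs; unreachable states keep
// distance INF.
std::vector<int> compute_goal_distances(
        const std::vector<std::vector<BackwardTransition>> &backward,
        const std::vector<int> &operator_costs,
        const std::vector<int> &goal_states) {
    std::vector<int> distances(backward.size(), INF);
    using Entry = std::pair<int, int>;  // (distance, state)
    std::priority_queue<Entry, std::vector<Entry>, std::greater<Entry>> queue;
    for (int goal : goal_states) {
        distances[goal] = 0;
        queue.push({0, goal});
    }
    while (!queue.empty()) {
        auto [dist, state] = queue.top();
        queue.pop();
        if (dist > distances[state])
            continue;  // Stale queue entry.
        for (auto [op, predecessor] : backward[state]) {
            int new_dist = dist + operator_costs[op];
            if (new_dist < distances[predecessor]) {
                distances[predecessor] = new_dist;
                queue.push({new_dist, predecessor});
            }
        }
    }
    return distances;
}
```

Refinement, in contrast, needs both incoming and outgoing transitions per state, which is why the two representations stay separate.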

> I didn't look at all of the order-related classes in any detail (or at all), and
> neither did I go through the projection class very thoroughly. For the latter,
> we should definitely discuss if we want to merge it in this state, with a lot of
> duplication from PDB *and* between forward-variants of abstract operators and
> the normal (should be called backward?) variant.

I removed the code duplication between forward and backward abstract operators.
We could also remove some of the code duplication between the new Projection and
the old PatternDatabase class, but this would mean changing the existing code,
which I'd rather defer to a separate issue. Do you agree, Malte?
msg8351 (view) Author: malte Date: 2018-12-14.16:19:31
I don't necessarily have to do a full code review, but I would like to know the
answers to the points raised in your msg8270.
msg8350 (view) Author: silvan Date: 2018-12-14.15:54:56
We discussed some further improvements offline. Once they are done, this could
be integrated from our perspective.

However, there is still the point of adding that extra concrete_operator_id to
AbstractOperator of PatternDatabase (see msg7338). And of course the question if
you, Malte, also want to have a look at the pull-request.
msg8270 (view) Author: silvan Date: 2018-12-12.18:42:38
I looked at the overall structure of the to-be-added subdir and left some
comments at bitbucket. I think this needs a bit further discussion as of what
this subdir (aka. plugin?) really wants to support now and in future, to decide
on some of the names that are used here.

As discussed offline, we also shouldn't forget that the SCP implementation of
CEGAR could switch to use this new code base.

I didn't look at all of the order-related classes in any detail (or at all), and
neither did I go through the projection class very thoroughly. For the latter,
we should definitely discuss if we want to merge it in this state, with a lot of
duplication from PDB *and* between forward-variants of abstract operators and
the normal (should be called backward?) variant.
msg7880 (view) Author: malte Date: 2018-10-04.17:44:13
Emails are coming in faster than I can deal with them at the moment, even though
I'm spending almost the whole day on email. :-) This will need some thought. If
you don't hear from me in the next few days, perhaps it's better to add this to
the review queue.
msg7876 (view) Author: jendrik Date: 2018-10-04.10:22:27
Before work can continue here, we have to find an answer to the question from 
msg7338. Malte, what's your opinion?
msg7825 (view) Author: jendrik Date: 2018-10-01.16:42:08
I have merged the latest revision from master into the issue branch.
msg7338 (view) Author: jendrik Date: 2018-08-08.21:05:50
I implemented your three suggestions, and at least for the example task the new version based on a forward sweep is now up-to-par with the match tree backward sweep version (the old version needs ~15s for computing 2000 cost partitionings, while the new version needs ~14s).

I ran an experiment to measure the effect across more tasks [1]. The time_for_computing_cost_partitionings attribute shows the time needed for computing 1000 saturated cost partitionings, including the time for computing the abstract goal distances. Each of the two versions is preferable to the other in many domains. I'd be interested to know whether you see any ways of making the for_each_transition() function [2] faster. If not, I think we can also stick to the current version.

Even in the new version, we need the AbstractOperator::get_concrete_operator_id() method for computing abstract goal 
distances under a given cost function. I see three possible ways of dealing with this:

1) We store the concrete operator ID in AbstractOperator. 
2) We store AbstractOperators belonging to different concrete operators in different vectors and allow changing 
AbstractOperator::cost.
3) We introduce a new class AbstractBackwardOperator that stores the concrete operator ID and no regression preconditions and 
write a new MatchTree class for it.

Which solution do you prefer?

[1] http://ai.cs.unibas.ch/_tmp_files/seipp/issue780-v6-issue780-v6-merge-issue780-v6-compare.html
[2] https://bitbucket.org/jendrikseipp/downward/pull-requests/88/issue780/diff#Lsrc/search/cost_saturation/projection.hT119
msg7336 (view) Author: malte Date: 2018-07-30.11:14:47
> Let's discuss the details of this offline.

Sorry, didn't take into account you were going on holiday. I had a look at the
latest commits on bitbucket, and at first glance they look like they are what I
had in mind (but no warranty, I didn't look at it very closely). Three
efficiency comments:

1) It may be worth converting recursion to iteration here.
2) It may be worth using templates instead of a callback.
3) It may be worth thinking about the order in which state variables are
considered in certain places to optimize cache efficiency. Generally speaking,
we want to move forward in small increments in the space of hash values. It may
be that the default already gets this right anyway, and it may be that the
variables should be considered in reverse order in some places to achieve this.
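Comment 2 can be illustrated by a small standalone sketch (hypothetical names, not the pull-request code): accepting the callback as a template parameter, rather than a std::function, lets the compiler inline the lambda body into the loop and removes a per-transition indirect call.

```cpp
#include <vector>

// Hypothetical transition representation for the sketch.
struct Transition {
    int src, op, dst;
};

// Template version: `callback` has a deduced type, so a lambda passed
// here can be inlined into the loop body. A
// std::function<void(const Transition &)> parameter would instead force
// a type-erased indirect call for every transition.
template<typename Callback>
void for_each_transition(const std::vector<Transition> &transitions,
                         Callback &&callback) {
    for (const Transition &transition : transitions)
        callback(transition);
}
```

The trade-off is the usual one: the template must live in a header and is instantiated per callback type, while std::function keeps the loop out-of-line at the cost of the indirection.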

If this doesn't pan out, I'd be interested in looking into it further at some
point because it is an interesting data structure problem, and a forward sweep
should be faster than a backward sweep using match trees for this. But of course
if it turns out this is difficult to make efficient we might want to revert to
an approach based on match trees at some point.
msg7335 (view) Author: malte Date: 2018-07-28.16:27:40
I had a brief look at the code; it is only part of what I had in mind. If I see
it correctly, it tests each operator on each abstract state for applicability,
so it will take time at least O(N * M) for N operators and M abstract states. What
I had in mind is not to test the preconditions, but to systematically generate
the states (in terms of their abstract hash values) that satisfy the
preconditions so that they do not need to be tested.

Additionally, depending on how many cost partitionings have to be computed and
how much memory is available, it may or may not be a good idea to precompute
more information about the operators. Not sure. Perhaps better not to do it at
first.

Basically, you want to keep as many computations as possible out of the inner
loop of the test. This also includes the effect that the operator has on the
perfect hash value. For operators in TNF, this can be fully precomputed. For
operators not in TNF, it may be worth multiplying out internally, in the same
way that the MatchTree does (for the same reason).
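The precomputation mentioned in the last paragraph can be sketched as follows for operators in TNF (a toy illustration with invented names, not the actual code): since every affected variable has a known precondition value, the operator shifts the perfect hash of any state it applies to by the same fixed amount, which can be computed once per operator.

```cpp
#include <vector>

// Hypothetical names. hash_multipliers[var] is the coefficient of `var`
// in the perfect hash (the product of the domain sizes of the variables
// ordered before it).
struct EffectEntry {
    int var, pre_value, post_value;
};

// For an operator in transition normal form, the successor's perfect
// hash is always predecessor_hash + hash_effect, so this value can be
// precomputed once per operator instead of being recomputed for every
// transition in the inner loop.
int compute_hash_effect(const std::vector<EffectEntry> &effects,
                        const std::vector<int> &hash_multipliers) {
    int hash_effect = 0;
    for (const EffectEntry &effect : effects)
        hash_effect += (effect.post_value - effect.pre_value) *
            hash_multipliers[effect.var];
    return hash_effect;
}
```

Operators not in TNF would need one such constant per combination of values of the variables they affect but do not mention, which is the "multiplying out" referred to above.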

Let's discuss the details of this offline.
msg7334 (view) Author: jendrik Date: 2018-07-28.14:28:16
I have implemented the idea Malte suggested in issue786 (msg7330) for computing the transitions in a projection without a match tree. 
However, the new code runs considerably slower than the old match tree version. Using the following invocation, the old code took ~1.5 
seconds for computing the 200 cost partitionings, whereas the new code needs ~10 seconds. This could become a problem when using bigger 
or more abstractions.  

./fast-downward.py --debug ../benchmarks/depot/p05.pddl --search "astar(saturated_cost_partitioning([projections(systematic(2))], 
max_orders=200, max_optimization_time=0, diversify=false), bound=0)"

I have already sped up the code by reducing the number of accesses to the proxy objects. For further speed improvements, I only see the 
possibility of precomputing and storing the preconditions and effects of active operators in each projection (as vectors of 
vector<FactPair>), but this seems like a step in the wrong direction and I doubt that it would make the code as fast as before again. 

Would you be fine with storing the mapping from AbstractOperators to concrete operator IDs in the saturated cost partitioning code 
(leaving the PDB code as it is)? Then we could use the match tree for computing the transitions as before.

Or do you see a different way of speeding up the implementation? The bottleneck is the for_each_transition() function [1]. See also the 
attached callgrind profile: computing the saturated costs takes 77.73% of the time, computing goal distances takes 13.47% of the time 
and both functions are called equally often (42050 times).

[1] https://bitbucket.org/jendrikseipp/downward/pull-requests/88/issue780/diff#Lsrc/search/cost_saturation/projection.ccT303
msg7122 (view) Author: malte Date: 2018-05-07.18:12:14
Hi Jendrik, with roughly 3000 new code lines, I would like to see this separated
into at least two changes:

1) one that contains the changes to the existing code that is necessary to
prepare the new code
2) one that contains only the new stuff

I'd like to review #1, but am less worried about #2, although it would of course
be good if someone could give it a light review.

If #1 contains some changes that affect many lines but don't do much (like
moving a function somewhere else), these should also be separated out if possible.

Regarding the code duplication, it would be good to avoid that if possible. Some
overhead for the old code is acceptable; it depends on how much it is. I can
give you a clearer answer with a more specific patch. Ditto for the added
operator ID: it doesn't sound like a problem to me, but I want to see a smaller
patch and some evaluation before saying something definitive there.
msg7121 (view) Author: jendrik Date: 2018-05-07.18:00:55
I made a pull request at https://bitbucket.org/jendrikseipp/downward/pull-requests/88 .

There are some open questions: the new code adds a class "Projection" that duplicates some 
code of the "PatternDatabase" class. One way to reduce the code duplication could be to 
turn some of the methods into functions. We could also reduce the code duplication in a 
separate issue. Also, the code stores the original operator ID in each 
pdbs::AbstractOperator. This makes it easy to compute saturated cost partitionings and we 
can reuse the pdbs::MatchTree without any modifications. On the other hand it's not ideal 
that the existing PDB doesn't need the original operator IDs and the new code doesn't need 
the pdbs::AbstractOperator::cost values.    

How would you like to proceed with this issue, Malte? Should I find a reviewer or add the 
issue to the review queue?
msg7066 (view) Author: jendrik Date: 2018-04-22.17:18:37
I'd like to merge the code for computing diverse, greedy and optimized saturated 
cost partitionings over Cartesian abstraction heuristics and PDB heuristics.
History
Date                 User     Action  Args
2020-07-10 20:39:49  jendrik  set     messages: + msg9589; summary: TODO: * Let someone review the code again. -> Pull request: https://github.com/aibasel/downward/pull/6 TODO: * Discuss code.
2020-05-02 15:39:23  jendrik  set     messages: + msg9254
2020-03-19 11:02:05  salome   set     messages: + msg9215
2020-03-19 10:01:33  jendrik  set     messages: + msg9214
2020-03-19 09:55:11  silvan   set     messages: + msg9213
2020-03-18 13:45:16  jendrik  set     messages: + msg9212
2020-02-28 19:07:16  jendrik  set     nosy: + salome; messages: + msg9207
2020-02-28 15:08:45  malte    set     messages: + msg9206
2020-02-28 14:59:22  thomas   set     messages: + msg9205
2020-01-13 20:21:05  jendrik  set     status: in-progress -> reviewing; messages: + msg9148; summary: TODO: * Merge default branch into issue branch. * Let someone review the code again. -> TODO: * Let someone review the code again.
2019-12-01 19:39:32  jendrik  set     status: reviewing -> in-progress; messages: + msg9041; summary: TODO: * Merge default branch into issue branch. * Let someone review the code again.
2019-01-04 18:05:34  malte    set     messages: + msg8428
2019-01-03 14:50:50  jendrik  set     messages: + msg8427
2018-12-14 17:49:00  jendrik  set     messages: + msg8356
2018-12-14 16:19:31  malte    set     messages: + msg8351
2018-12-14 15:54:56  silvan   set     messages: + msg8350
2018-12-12 18:42:38  silvan   set     messages: + msg8270
2018-10-04 17:44:14  malte    set     messages: + msg7880
2018-10-04 10:22:27  jendrik  set     messages: + msg7876
2018-10-01 16:42:08  jendrik  set     messages: + msg7825
2018-08-08 21:05:50  jendrik  set     messages: + msg7338
2018-07-30 11:14:47  malte    set     messages: + msg7336
2018-07-28 16:27:40  malte    set     messages: + msg7335
2018-07-28 14:28:17  jendrik  set     files: + callgrind.out.7907; messages: + msg7334
2018-06-04 14:17:37  silvan   set     nosy: + silvan
2018-05-07 18:12:14  malte    set     messages: + msg7122
2018-05-07 18:00:55  jendrik  set     status: in-progress -> reviewing; messages: + msg7121
2018-04-23 09:14:52  thomas   set     nosy: + thomas
2018-04-22 17:18:37  jendrik  create