Title: Use task-interface for pruning methods
Priority: wish
Status: resolved
Superseder:
Nosy List: florian, jendrik, malte, martin
Assigned To: florian
Keywords:
Optional summary:

Created on 2016-01-26.12:02:19 by martin, last changed by florian.

msg6013 (view) Author: florian Date: 2017-01-05.18:02:32
The Windows build broke during the merge, but issue704 should fix this.
msg6010 (view) Author: malte Date: 2017-01-05.01:50:05
Yes, looks good to merge.
msg6009 (view) Author: florian Date: 2017-01-04.23:07:48
Performance looks great. Compared to the previous version, it actually got a
little faster. Coverage is mostly unaffected, but with the EC variant, we now
solve 3 more tasks compared to v9. Memory and the number of
generated/evaluated/expanded nodes are not affected, as we expected.

Are you fine with merging this?
msg5980 (view) Author: malte Date: 2016-12-22.22:03:30
OK, all looks good. If the latest revision doesn't significantly lose
performance, we can merge.
msg5978 (view) Author: malte Date: 2016-12-22.18:37:29
Sounds good. I'll have a closer look at the current experimental results later.
msg5977 (view) Author: florian Date: 2016-12-22.18:35:03
Thanks for the comments, I'll handle the latest batch of them tonight. 

I think we can use the existing experiments I added in msg5575 for discussing
performance. Once I'm done with your comments, I'll run another experiment
comparing version v9 to the tip of the branch to verify that I didn't mess up
performance since then. I think there were no changes since v9 that should
impact performance, so I don't expect any surprises there.
msg5974 (view) Author: malte Date: 2016-12-22.18:14:05
I'm done looking at the code. Due to the many large changes compared to the old
code, I don't really feel confident that the code still behaves the same way as
before, doesn't crash, etc., but the experiments should show that. (Do we need
updated experiments?) The parts I could check like the interfaces and data
structures used etc. look good to me.

So I think we can merge once we're happy with the experiments. Should I look
into that now, or should we wait for the latest code changes and possibly run
some new experiments?
msg5609 (view) Author: florian Date: 2016-09-06.12:11:15
I updated the comment in commit 9bc6fff:

Issue556 is the only issue related to variable ordering that I could find. I'll
add a cross reference.
msg5601 (view) Author: malte Date: 2016-09-05.19:40:30
I briefly discussed some algorithmic points with Florian. Most importantly:

1) The sorting of preconditions and goals is intentional and important.

In our ICAPS 2014 paper ("Efficient Stubborn Sets: Generalized Algorithms and
Selection Strategies"), Martin and I conducted a very limited study of this. One
of the outcomes was that the sorted version ("static orders/FD" in Table 1 of the
paper) is dramatically better than choosing preconditions and goals randomly
every time ("dynamic orders/random" in Table 1). The coverage improvement by
sorting rather than random choice is +42, +28, +38 and +24 for the different
other parameter choices in Table 1.

2) I think at some time in between we broke the intended sorting order in minor
ways (probably when we got rid of the prevail/precondition distinction in the
search component). I think Florian's new code, which always sorts, restores the
original intention, so this is fine despite the small loss of coverage.

3) This is also one of the places in the planner that relies on the fact that
the variable order approximates a causal graph topological sort, i.e., the code
intentionally uses the "causal graph order" of variables rather than an
arbitrary variable order.

Hence, we wouldn't be surprised if performance dropped significantly with
different variable orders because there are some intuitive arguments why this
kind of ordering is good. (However, we never looked into this in much depth, so
I wouldn't be terribly surprised if this particular order isn't all that great
after all.)

In summary, this is one part of the code that *very intentionally* sorts the
task and that *very intentionally* uses the "causal graph order" (which
coincides with the variable numbers) for this sorting. We should document this
in the code, with a pointer to the ICAPS 2014 paper that discusses these matters
(in section "Strategies for Choosing Unsatisfied Conditions" and especially
subsection "Static Variable Orderings"), to help us not break this in the future.

Also, if we already have an issue open related to sorting task representations
or to variable ordering/getting rid of the implicit assumption that everything
is using the "causal graph order", I think we should add pointers there that
make sure we don't forget about the impact of these issues on the pruning code.
msg5600 (view) Author: florian Date: 2016-09-05.12:19:32
Thanks. I updated the PR including your suggestions.
msg5598 (view) Author: jendrik Date: 2016-09-05.10:52:28
I have left a few comments regarding style and general code organization on the
pull request.
msg5575 (view) Author: florian Date: 2016-08-23.11:30:44
I ran a bunch of experiments to isolate the effects of different changes.

The first experiment compares the latest merged default revision
(issue629-v7-base) against a revision (issue629-v7) that includes the
main work from this issue. This revision does some additional work to
keep the order of preconditions the same as before. Both revisions also
sort the operators by their pointer value after pruning.

The effect is clearly positive. Both variants of stubborn sets become
faster (simple: ~5%, EC: ~25%) and solve more tasks (simple: 2, EC:
13). The control config (A*/blind without pruning) has the usual noise
within +-5% but no clear effect in one direction or the other, as expected.

For the second experiment, I removed the data structure that was used
to fix the order of preconditions within the pruning methods and used
the sorted copy instead (issue629-v8). We use the sorted variant instead of the
original order, because we rely on the vector being sorted in some
other parts of the code.

For both pruning methods, the new order is worse in some domains
(openstacks, psr) and better in others (pegsol, tetris), but overall it
is worse and we lose some tasks (simple: 2, EC: 4).

The results for the control config are surprising: the code change is
only in code that is not executed in this config, and still the performance
shows a clear negative effect of around 2% in total time.
I have no idea what could have caused this. The runs of the experiment
were executed in randomized order, as usual. The only difference
between v7 and v8 is this commit, which only touches the stubborn set code.

The third step removed the ordering of the operators after pruning (issue629-v9).
This has almost no effect in most cases, but a large effect in some
instances (e.g., mprime, tidybot, driverlog). In total, we lose one
task each in both pruning methods. The control is unaffected, as expected.

Even though we lose some coverage with the latter two changes, I would
still suggest including them. In both cases, one arbitrary order works
better than another. Optimizing for this seems like overfitting, so I
would opt for the version that has the cleaner code. There seems to be
some potential in looking at the order in which preconditions are
considered but this seems to be a separate issue.

Here is a report for the total change, comparing the last merged
version on default to the last version of the issue branch.

Martin, could you do a review of the changes, checking for correctness?
Jendrik, if you could also have a look to check our coding guidelines,
style, etc., that would be great. Maybe one of you can spot why the
second change affects blind search without pruning.
msg5549 (view) Author: florian Date: 2016-08-15.12:27:11
I got most of the performance back by reverting some of the changes that used the task interface in 
time-critical places. The previous code already had a local sorted copy of all operator 
preconditions and effects. During the tests I noticed that changing the method that looks for an
unsatisfied precondition to use the sorted array improves the pruning power on average
(11 additional tasks solved).

I would like to continue this issue in two stages:
1) Make a version that uses the task interface but maintains the order of the previous code, so it 
has the same pruning power. This version should have comparable performance to the baseline.
2) Change the method that looks for unsatisfied preconditions to use the local copy and see if it 
still improves pruning power.

The current version of the pull request unfortunately has a small difference in pruning power for 
the EC variant. I'll try to find out why.
msg5459 (view) Author: florian Date: 2016-06-24.11:22:56
I started a pull request for this and ran some experiments, but we currently get a 30% 
performance hit in blind search. I'll try to figure out the reason for it. I guess it could be the
operator translation from GlobalOperator to OperatorProxy and back.
Date                 User     Action  Args
2017-01-05 18:02:32  florian  set     messages: + msg6013
2017-01-05 16:07:19  malte    set     status: reviewing -> resolved
2017-01-05 01:50:05  malte    set     messages: + msg6010
2017-01-04 23:07:48  florian  set     messages: + msg6009
2016-12-22 22:03:30  malte    set     messages: + msg5980
2016-12-22 18:37:29  malte    set     messages: + msg5978
2016-12-22 18:35:03  florian  set     messages: + msg5977
2016-12-22 18:14:06  malte    set     messages: + msg5974
2016-09-06 12:11:15  florian  set     messages: + msg5609
2016-09-05 19:40:30  malte    set     messages: + msg5601
2016-09-05 12:19:32  florian  set     messages: + msg5600
2016-09-05 10:52:28  jendrik  set     messages: + msg5598
2016-08-23 11:30:44  florian  set     status: chatting -> reviewing; messages: + msg5575
2016-08-15 12:27:11  florian  set     messages: + msg5549
2016-06-24 11:22:56  florian  set     status: unread -> chatting; messages: + msg5459
2016-01-26 12:02:19  martin   create