Title remove GlobalOperator::marked
Priority feature Status resolved
Superseder Nosy List florian, jendrik, malte
Assigned To jendrik Keywords
Optional summary

Created on 2015-06-03.16:48:29 by jendrik, last changed by jendrik.

msg5745 (view) Author: jendrik Date: 2016-10-17.10:03:09
Merged and pushed. 

Reusing evaluation contexts is issue675.
msg5743 (view) Author: malte Date: 2016-10-14.22:20:32
Thanks! I had a look at how often the base config outperforms the new config in
terms of the number of expansions on a per-task basis. With the null hypothesis
that they are both equally likely to outperform each other in terms of
expansions, there is no very strong statistical evidence against.

The most extreme result is the one for seed 2, where the advantage of the old
config in terms of expansions would only occur randomly around 5.8% of the times
(comparing only on the ones solved by both). That might indicate that there is a
significant difference, but it is not highly significant.

There is no doubt that it's getting worse in terms of total time, but perhaps
this is something we can look into later on as we eventually streamline the data
structures used for this. For example, we could try to accomplish using the same
data structure for the set of preferred operators throughout the search rather
than creating lots of new ordered sets all the time.

As I said earlier, I think it could move into the evaluation context, and we
could reuse evaluation contexts rather than creating new ones all the time. If
there isn't an issue for that yet, we might want to open one.

I'm fine with merging this.
msg5741 (view) Author: jendrik Date: 2016-10-14.19:05:58
Sure, here you go:

msg5740 (view) Author: malte Date: 2016-10-14.18:05:28
Thanks! If only the last column wasn't there...

Can you let me know where I can find the properties file on maia? (The path in
your home directory suffices, assuming that you haven't disabled read access for
our group.)
msg5737 (view) Author: jendrik Date: 2016-10-14.15:28:54
As discussed I ran the last configuration again with three different random seeds:

(The table still uses the old way of aggregating score values.)
msg5730 (view) Author: jendrik Date: 2016-10-13.14:40:19
Here is a scatter plot showing the number of expansions for the configuration in question:

Here is the same plot, but zoomed in (omitting all points > 10x and < 0.1x):
msg5729 (view) Author: malte Date: 2016-10-13.10:38:43
score_expansions also reduces by a lot in this config. There are three reasons
why it can change:

(1) random noise
(2) lower coverage due to slow-down or similar things
(3) a change in behaviour that leads to more expansions

Regarding (2): this cannot explain the size of the effect, as coverage only
decreases by 5, but score_expansions decreases by 19.03.

Regarding (1): I counted in how many domains score_expansions increases (18) vs.
the number where it decreases (42). The probability of observing such a bad
result randomly are less than 0.15%.

So at the moment I cannot see another explanation than (3), although I also
don't see where or how the behaviour in this case should have changed. Anyone
got any further ideas for getting to the bottom of this?
msg5728 (view) Author: malte Date: 2016-10-13.10:22:46
Note also that most of the left half of the scatter plot -- everything where
runtimes are below 1 second -- doesn't actually influence score_total_time. The
score_total_time value is identical for all runtimes below 1 second. So the big
increase cannot be explained by what happens on small tasks.
msg5727 (view) Author: malte Date: 2016-10-13.10:11:26
> Although the total time score decreases, the corresponding total_time scatter
> plot shows a negative trend, i.e., less runtime for long-running tasks.

I don't think so -- I think you may be getting distracted by the (few) extreme
outliers. Also for long runtime, there seem to be many, many tasks where runtime
increases by a factor of 1.1 or so, which is quite massive for something like
shuffling the preferred operators that should be low overhead compared to
heuristic evaluation.
msg5726 (view) Author: jendrik Date: 2016-10-13.10:05:44
Although the total time score decreases, the corresponding total_time scatter 
plot shows a negative trend, i.e., less runtime for long-running tasks.

I have looked at the diff again, focusing on the relevant code path, but 
couldn't spot any difference in behavior.

I'd attribute the dispersed results to the randomization.
msg5725 (view) Author: malte Date: 2016-10-13.08:45:00
The effect in the last column, where total time score reduces by ~50, is rather
severe. Did we change the behaviour there, or can this just be increased
overhead due to the use of OrderedSet?

The effect on total_evaluations etc. for this configuration is also quite
severe, given that the overall behaviour should also be the same (although of
course it is randomized).

The other configurations look acceptable.
msg5724 (view) Author: jendrik Date: 2016-10-13.01:20:22
Here are the results:

The tables don't look too good on first sight, but the plots show favorable trends, I think.
msg5720 (view) Author: malte Date: 2016-10-12.18:52:02
OK, no further comments on Bitbucket. I'm glad we'll finally be able to get rid
of this hack soon.
msg5719 (view) Author: jendrik Date: 2016-10-12.18:42:02
1) I don't think so.

2) I just started an experiment testing the various flags of lazy search.
msg5708 (view) Author: malte Date: 2016-10-11.13:58:12
I had another look. I have no further comments beyond what I wrote on the
tracker, but have two questions:

1) Since the last experiments, did we make further changes that warrant a new
round of experiments?

2) Did we test the various flags in lazy search that affect successor generation
(such as preferred first and randomization)?
msg5698 (view) Author: malte Date: 2016-09-30.11:39:18
I agree on EHC. I'd like to have another look at the code, but I'm not sure how
soon I'll have the opportunity. If you don't hear from me again by Tuesday, some
nagging might speed things up. ;-)
msg5697 (view) Author: jendrik Date: 2016-09-29.17:01:36
Given that EHC is not one of our main algorithms, the relative changes in runtime 
are small and the fact that we shouldn't expect to perform better than using 
GlobalOperator::marked, I'd say this is not a deal-breaker.
msg5693 (view) Author: florian Date: 2016-09-29.10:14:13
> I tested ehc() for all combinations of preferred_usage and use_preferred:

Looks like the effect on time is consistently negative for some domains.
The effects are mostly small, though.
msg5692 (view) Author: florian Date: 2016-09-29.09:43:46
> Also, as I wrote in msg5659, lazy_search wasn't reviewed in detail yet.
I looked at it in my review (msg5664, msg5666).
msg5688 (view) Author: jendrik Date: 2016-09-28.23:23:50
The following link should show the diff between the last revision you reviewed and the 
branch tip:

Sorry, for not answering your comments. I liked and made all your proposed changes.
msg5686 (view) Author: malte Date: 2016-09-28.22:11:16
Is there a good way to see what you changed in the parts of the code that we
commented on, other than by doing a new code review from scratch? With no
answers to the questions and no obvious way to find out how the code was
changed, it is hard to sign off on the changes. (Also, as I wrote in msg5659,
lazy_search wasn't reviewed in detail yet.)
msg5681 (view) Author: jendrik Date: 2016-09-28.18:13:30
I tested ehc() for all combinations of preferred_usage and use_preferred:

No changes in the number of expanded, evaluated and generated states. Looks fine to me.

I'm also done handling the code comments. Can this be merged?
msg5666 (view) Author: florian Date: 2016-09-27.14:13:30
I'm done with this round of comments.
msg5664 (view) Author: florian Date: 2016-09-26.11:06:06
I'll try to have a look at the pull request today.
msg5663 (view) Author: malte Date: 2016-09-24.16:04:00
OK, I've run that one a few times on my desktop at home, and the differences
between the two versions were negligible. Runtime varied between 788 and 1065
seconds depending on the load of the machine, but there was no noticeable
difference between the base and v5 versions.
msg5662 (view) Author: malte Date: 2016-09-24.14:38:07
There are a few data points that I would like to find out a bit more about, such
as parking-sat11-strips/pfile10-039.pddl with lazy_greedy_ff, where runtime
increases from 346.97 seconds to 645.32 seconds (no change in number of
evaluated states). If that's just random noise (or rather, related to the load
on maia), there's no problem, but if the code changes actually add 300 seconds
to the runtime, we need to look into them.
msg5661 (view) Author: jendrik Date: 2016-09-24.11:34:23
Here are the results for an experiment comparing the default branch with the latest issue branch revision:

Overall the total_time score goes down a little bit, but I don't think this is a dealbreaker.
msg5659 (view) Author: malte Date: 2016-09-23.18:36:33
I made some comments. I skipped enforced hill-climbing and would appreciate if
someone else could check it, and I mostly skipped lazy search because there is a
more general comment to address there first. The other files look fine to me
apart from the comments I made (the collection of preferred operators might be
much slower than before because of the "one-by-one" implementation we have now,
but let's wait for the experiments before rushing to change that), but perhaps
it would be good for someone else to have another look.
msg5658 (view) Author: jendrik Date: 2016-09-23.17:57:05
I have updated the pull request to reflect or offline discussion. Can one of you 
have a look again, please?
msg5655 (view) Author: florian Date: 2016-09-20.22:04:06
I'm done with my comments.

> Related to our recent discussion of directories, I think this belongs in the
> "algorithms" subdirectory, not "utils". Right?
It would be great, if we had some kind of documentation, what goes where. ;-)
msg5650 (view) Author: malte Date: 2016-09-19.17:42:16
I made a few more comments, but did not look at everything. If Florian does a
round of reviewing, this could speed things up.

Related to our recent discussion of directories, I think this belongs in the
"algorithms" subdirectory, not "utils". Right?
msg5647 (view) Author: jendrik Date: 2016-09-19.15:51:52
Thanks again for the comments Malte. 

The code now uses OrderedSet for lazy search and EHC as well. Retrieving the 
individual ordered and unordered collections from the OrderedSet makes for a 
little cumbersome code, but it avoids copying the data.

While I was at it, I used OrderedDict in eager search as well.

Malte, could you have another look at the code? Or should I ask Florian?
msg5635 (view) Author: jendrik Date: 2016-09-16.22:42:18
Florian and I agreed that this issue should be resolved before we continue the discussion about 
whether to keep GlobalOperator (see issue509).

I have made a pull request at

Could you have a look at the code please, Florian?

I have already run some experiments for ["--heuristic", "h=ff()", "--search", "lazy_greedy(h, 
preferred=h)"]. It seems the runtime is dominated by the heuristic computation (which is probably 
true for most heuristics marking preferred operators), so I'd say we wait with running further 
experiments until we're all fine with the code.
msg4235 (view) Author: jendrik Date: 2015-06-03.16:48:29
The flag is currently used for marking preferred operators. It should be removed to allow for 
cleaner code.
Date User Action Args
2016-10-17 10:03:09jendriksetstatus: reviewing -> resolved
messages: + msg5745
2016-10-14 22:20:32maltesetmessages: + msg5743
2016-10-14 19:05:58jendriksetmessages: + msg5741
2016-10-14 18:05:28maltesetmessages: + msg5740
2016-10-14 15:28:54jendriksetmessages: + msg5737
2016-10-13 14:40:19jendriksetmessages: + msg5730
2016-10-13 10:38:43maltesetmessages: + msg5729
2016-10-13 10:22:46maltesetmessages: + msg5728
2016-10-13 10:11:26maltesetmessages: + msg5727
2016-10-13 10:05:44jendriksetmessages: + msg5726
2016-10-13 08:45:00maltesetmessages: + msg5725
2016-10-13 01:20:22jendriksetmessages: + msg5724
2016-10-12 18:52:02maltesetmessages: + msg5720
2016-10-12 18:42:02jendriksetmessages: + msg5719
2016-10-11 13:58:12maltesetmessages: + msg5708
2016-09-30 11:39:18maltesetmessages: + msg5698
2016-09-29 17:01:36jendriksetmessages: + msg5697
2016-09-29 10:14:13floriansetmessages: + msg5693
2016-09-29 09:43:46floriansetmessages: + msg5692
2016-09-28 23:23:50jendriksetmessages: + msg5688
2016-09-28 22:11:16maltesetmessages: + msg5686
2016-09-28 18:13:30jendriksetmessages: + msg5681
2016-09-27 14:13:30floriansetmessages: + msg5666
2016-09-26 11:06:06floriansetmessages: + msg5664
2016-09-24 16:04:00maltesetmessages: + msg5663
2016-09-24 14:38:08maltesetmessages: + msg5662
2016-09-24 11:34:24jendriksetmessages: + msg5661
2016-09-23 18:36:33maltesetmessages: + msg5659
2016-09-23 17:57:05jendriksetmessages: + msg5658
2016-09-20 22:04:06floriansetmessages: + msg5655
2016-09-19 17:42:16maltesetmessages: + msg5650
2016-09-19 15:51:52jendriksetstatus: chatting -> reviewing
messages: + msg5647
2016-09-16 22:42:18jendriksetstatus: unread -> chatting
nosy: + florian
messages: + msg5635
2016-09-16 11:40:35jendriksetassignedto: jendrik
2015-06-03 16:48:29jendrikcreate