msg5745 (view) |
Author: jendrik |
Date: 2016-10-17.10:03:09 |
|
Merged and pushed.
Reusing evaluation contexts is issue675.
|
msg5743 (view) |
Author: malte |
Date: 2016-10-14.22:20:32 |
|
Thanks! I had a look at how often the base config outperforms the new config in
terms of the number of expansions on a per-task basis. With the null hypothesis
that they are both equally likely to outperform each other in terms of
expansions, there is no very strong statistical evidence against it.
The most extreme result is the one for seed 2, where the advantage of the old
config in terms of expansions would only occur randomly around 5.8% of the time
(comparing only the tasks solved by both). That might indicate that there is a
significant difference, but it is not highly significant.
There is no doubt that it's getting worse in terms of total time, but perhaps
this is something we can look into later on as we eventually streamline the data
structures used for this. For example, we could try to use the same data
structure for the set of preferred operators throughout the search rather than
creating lots of new ordered sets all the time.
As I said earlier, I think it could move into the evaluation context, and we
could reuse evaluation contexts rather than creating new ones all the time. If
there isn't an issue for that yet, we might want to open one.
I'm fine with merging this.
|
msg5741 (view) |
Author: jendrik |
Date: 2016-10-14.19:05:58 |
|
Sure, here you go:
~/projects/jendrik-downward/experiments/issue535/data/issue535-v7-randomized-eval/properties
|
msg5740 (view) |
Author: malte |
Date: 2016-10-14.18:05:28 |
|
Thanks! If only the last column wasn't there...
Can you let me know where I can find the properties file on maia? (The path in
your home directory suffices, assuming that you haven't disabled read access for
our group.)
|
msg5737 (view) |
Author: jendrik |
Date: 2016-10-14.15:28:54 |
|
As discussed I ran the last configuration again with three different random seeds:
http://ai.cs.unibas.ch/_tmp_files/seipp/issue535-v7-randomized-issue535-base-issue535-v7-compare.html
(The table still uses the old way of aggregating score values.)
|
msg5730 (view) |
Author: jendrik |
Date: 2016-10-13.14:40:19 |
|
Here is a scatter plot showing the number of expansions for the configuration in question:
http://ai.cs.unibas.ch/_tmp_files/seipp/issue535-base-lazy-ff-pref-True-randomize-True-vs-issue535-v7-lazy-ff-pref-True-randomize-True-expansions.png
Here is the same plot, but zoomed in (omitting all points > 10x and < 0.1x):
http://ai.cs.unibas.ch/_tmp_files/seipp/issue535-base-lazy-ff-pref-True-randomize-True-vs-issue535-v7-lazy-ff-pref-True-randomize-True-expansions-censored10x.png
|
msg5729 (view) |
Author: malte |
Date: 2016-10-13.10:38:43 |
|
score_expansions also reduces by a lot in this config. There are three reasons
why it can change:
(1) random noise
(2) lower coverage due to slow-down or similar things
(3) a change in behaviour that leads to more expansions
Regarding (2): this cannot explain the size of the effect, as coverage only
decreases by 5, but score_expansions decreases by 19.03.
Regarding (1): I counted the number of domains in which score_expansions
increases (18) vs. the number in which it decreases (42). The probability of
observing such a bad result randomly is less than 0.15%.
So at the moment I cannot see another explanation than (3), although I also
don't see where or how the behaviour in this case should have changed. Anyone
got any further ideas for getting to the bottom of this?
|
msg5728 (view) |
Author: malte |
Date: 2016-10-13.10:22:46 |
|
Note also that most of the left half of the scatter plot -- everything where
runtimes are below 1 second -- doesn't actually influence score_total_time. The
score_total_time value is identical for all runtimes below 1 second. So the big
increase cannot be explained by what happens on small tasks.
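As an aside, the flat region below 1 second follows from the logarithmic time
score as I understand downward-lab computes it (the bounds and the exact formula
here are my assumption, not taken from this thread):

```python
# Sketch of a logarithmic time score (assumed formula): runtimes at or
# below the lower bound of 1 second all get the full score, so nothing
# in that part of a scatter plot can move score_total_time.
from math import log

def time_score(total_time, lower=1.0, upper=1800.0):
    """Return a score in [0, 1]: 1 for runtimes <= lower, 0 for runtimes
    >= upper (the assumed 30-minute limit), log-interpolated in between."""
    if total_time is None:          # unsolved task
        return 0.0
    if total_time <= lower:
        return 1.0
    if total_time >= upper:
        return 0.0
    return (log(upper) - log(total_time)) / (log(upper) - log(lower))

# All runtimes below 1 second map to the same score:
print(time_score(0.01), time_score(0.9))  # both 1.0
```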
|
msg5727 (view) |
Author: malte |
Date: 2016-10-13.10:11:26 |
|
> Although the total time score decreases, the corresponding total_time scatter
> plot shows a negative trend, i.e., less runtime for long-running tasks.
I don't think so -- I think you may be getting distracted by the (few) extreme
outliers. Also for long runtimes, there seem to be many, many tasks where the
runtime increases by a factor of 1.1 or so, which is quite massive for something
like shuffling the preferred operators, a step that should be low overhead
compared to heuristic evaluation.
|
msg5726 (view) |
Author: jendrik |
Date: 2016-10-13.10:05:44 |
|
Although the total time score decreases, the corresponding total_time scatter
plot shows a negative trend, i.e., less runtime for long-running tasks.
I have looked at the diff again, focusing on the relevant code path, but
couldn't spot any difference in behavior.
I'd attribute the dispersed results to the randomization.
|
msg5725 (view) |
Author: malte |
Date: 2016-10-13.08:45:00 |
|
The effect in the last column, where total time score reduces by ~50, is rather
severe. Did we change the behaviour there, or can this just be increased
overhead due to the use of OrderedSet?
The effect on total_evaluations etc. for this configuration is also quite
severe, given that the overall behaviour should also be the same (although of
course it is randomized).
The other configurations look acceptable.
|
msg5724 (view) |
Author: jendrik |
Date: 2016-10-13.01:20:22 |
|
Here are the results:
http://ai.cs.unibas.ch/_tmp_files/seipp/issue535-v7-issue535-base-issue535-v7-compare.html
http://ai.cs.unibas.ch/_tmp_files/seipp/issue535-v7-scatter-plots.tar.gz
The tables don't look too good on first sight, but the plots show favorable trends, I think.
|
msg5720 (view) |
Author: malte |
Date: 2016-10-12.18:52:02 |
|
OK, no further comments on Bitbucket. I'm glad we'll finally be able to get rid
of this hack soon.
|
msg5719 (view) |
Author: jendrik |
Date: 2016-10-12.18:42:02 |
|
1) I don't think so.
2) I just started an experiment testing the various flags of lazy search.
|
msg5708 (view) |
Author: malte |
Date: 2016-10-11.13:58:12 |
|
I had another look. I have no further comments beyond what I wrote on the
tracker, but have two questions:
1) Since the last experiments, did we make further changes that warrant a new
round of experiments?
2) Did we test the various flags in lazy search that affect successor generation
(such as preferred first and randomization)?
|
msg5698 (view) |
Author: malte |
Date: 2016-09-30.11:39:18 |
|
I agree on EHC. I'd like to have another look at the code, but I'm not sure how
soon I'll have the opportunity. If you don't hear from me again by Tuesday, some
nagging might speed things up. ;-)
|
msg5697 (view) |
Author: jendrik |
Date: 2016-09-29.17:01:36 |
|
Given that EHC is not one of our main algorithms, that the relative changes in
runtime are small, and that we shouldn't expect to perform better than using
GlobalOperator::marked, I'd say this is not a deal-breaker.
|
msg5693 (view) |
Author: florian |
Date: 2016-09-29.10:14:13 |
|
> I tested ehc() for all combinations of preferred_usage and use_preferred:
>
> http://ai.cs.unibas.ch/_tmp_files/seipp/issue535-v6-issue535-base-issue535-v6-compare.html
Looks like the effect on time is consistently negative for some domains.
elevators-sat08-strips
openstacks-sat08-adl
openstacks-sat08-strips
optical-telegraphs
parcprinter-08-strips
parcprinter-sat11-strips
philosophers
transport-sat14-strips
The effects are mostly small, though.
|
msg5692 (view) |
Author: florian |
Date: 2016-09-29.09:43:46 |
|
> Also, as I wrote in msg5659, lazy_search wasn't reviewed in detail yet.
I looked at it in my review (msg5664, msg5666).
|
msg5688 (view) |
Author: jendrik |
Date: 2016-09-28.23:23:50 |
|
The following link should show the diff between the last revision you reviewed and the
branch tip:
https://bitbucket.org/jendrikseipp/downward/branches/compare/1e259bb..4c74355#diff
Sorry for not answering your comments. I liked all your proposed changes and made them.
|
msg5686 (view) |
Author: malte |
Date: 2016-09-28.22:11:16 |
|
Is there a good way to see what you changed in the parts of the code that we
commented on, other than by doing a new code review from scratch? With no
answers to the questions and no obvious way to find out how the code was
changed, it is hard to sign off on the changes. (Also, as I wrote in msg5659,
lazy_search wasn't reviewed in detail yet.)
|
msg5681 (view) |
Author: jendrik |
Date: 2016-09-28.18:13:30 |
|
I tested ehc() for all combinations of preferred_usage and use_preferred:
http://ai.cs.unibas.ch/_tmp_files/seipp/issue535-v6-issue535-base-issue535-v6-compare.html
No changes in the number of expanded, evaluated and generated states. Looks fine to me.
I'm also done handling the code comments. Can this be merged?
|
msg5666 (view) |
Author: florian |
Date: 2016-09-27.14:13:30 |
|
I'm done with this round of comments.
|
msg5664 (view) |
Author: florian |
Date: 2016-09-26.11:06:06 |
|
I'll try to have a look at the pull request today.
|
msg5663 (view) |
Author: malte |
Date: 2016-09-24.16:04:00 |
|
OK, I've run that one a few times on my desktop at home, and the differences
between the two versions were negligible. Runtime varied between 788 and 1065
seconds depending on the load of the machine, but there was no noticeable
difference between the base and v5 versions.
|
msg5662 (view) |
Author: malte |
Date: 2016-09-24.14:38:07 |
|
There are a few data points that I would like to find out a bit more about, such
as parking-sat11-strips/pfile10-039.pddl with lazy_greedy_ff, where runtime
increases from 346.97 seconds to 645.32 seconds (no change in number of
evaluated states). If that's just random noise (or rather, related to the load
on maia), there's no problem, but if the code changes actually add 300 seconds
to the runtime, we need to look into them.
|
msg5661 (view) |
Author: jendrik |
Date: 2016-09-24.11:34:23 |
|
Here are the results for an experiment comparing the default branch with the latest issue branch revision:
http://ai.cs.unibas.ch/_tmp_files/seipp/issue535-v5-issue535-base-issue535-v5-compare.html
http://ai.cs.unibas.ch/_tmp_files/seipp/issue535-v5-issue535-base-issue535-v5-total_time-lazy_greedy_ff.png
http://ai.cs.unibas.ch/_tmp_files/seipp/issue535-v5-issue535-base-issue535-v5-total_time-eager_greedy_ff.png
http://ai.cs.unibas.ch/_tmp_files/seipp/issue535-v5-issue535-base-issue535-v5-total_time-ehc_ff.png
http://ai.cs.unibas.ch/_tmp_files/seipp/issue535-v5-issue535-base-issue535-v5-total_time-lama-first.png
Overall the total_time score goes down a little bit, but I don't think this is a dealbreaker.
|
msg5659 (view) |
Author: malte |
Date: 2016-09-23.18:36:33 |
|
I made some comments. I skipped enforced hill-climbing and would appreciate if
someone else could check it, and I mostly skipped lazy search because there is a
more general comment to address there first. The other files look fine to me
apart from the comments I made (the collection of preferred operators might be
much slower than before because of the "one-by-one" implementation we have now,
but let's wait for the experiments before rushing to change that), but perhaps
it would be good for someone else to have another look.
|
msg5658 (view) |
Author: jendrik |
Date: 2016-09-23.17:57:05 |
|
I have updated the pull request to reflect our offline discussion. Can one of you
have a look again, please?
|
msg5655 (view) |
Author: florian |
Date: 2016-09-20.22:04:06 |
|
I'm done with my comments.
> Related to our recent discussion of directories, I think this belongs in the
> "algorithms" subdirectory, not "utils". Right?
It would be great if we had some kind of documentation on what goes where. ;-)
|
msg5650 (view) |
Author: malte |
Date: 2016-09-19.17:42:16 |
|
I made a few more comments, but did not look at everything. If Florian does a
round of reviewing, this could speed things up.
Related to our recent discussion of directories, I think this belongs in the
"algorithms" subdirectory, not "utils". Right?
|
msg5647 (view) |
Author: jendrik |
Date: 2016-09-19.15:51:52 |
|
Thanks again for the comments Malte.
The code now uses OrderedSet for lazy search and EHC as well. Retrieving the
individual ordered and unordered collections from the OrderedSet makes the code
a little cumbersome, but it avoids copying the data.
While I was at it, I used OrderedDict in eager search as well.
Malte, could you have another look at the code? Or should I ask Florian?
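For readers without access to the pull request, the idea behind OrderedSet can
be sketched roughly as follows (a Python sketch of the concept only; the names
and interface are mine, not those of the actual C++ class):

```python
class OrderedSet:
    """Illustrative sketch (not the actual Fast Downward interface):
    a set that remembers insertion order and exposes both the ordered
    sequence and the membership set without copying on retrieval."""

    def __init__(self):
        self._seen = set()       # fast membership test
        self._ordered = []       # insertion order

    def insert(self, item) -> bool:
        """Add item; return True if it was not present before."""
        if item in self._seen:
            return False
        self._seen.add(item)
        self._ordered.append(item)
        return True

    @property
    def ordered(self):
        return self._ordered     # direct view, no copy

    @property
    def unordered(self):
        return self._seen        # direct view, no copy

# Duplicates are ignored, first-insertion order is preserved:
ops = OrderedSet()
for op in ["pick", "move", "pick", "drop"]:
    ops.insert(op)
print(ops.ordered)  # ['pick', 'move', 'drop']
```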
|
msg5635 (view) |
Author: jendrik |
Date: 2016-09-16.22:42:18 |
|
Florian and I agreed that this issue should be resolved before we continue the discussion about
whether to keep GlobalOperator (see issue509).
I have made a pull request at https://bitbucket.org/jendrikseipp/downward/pull-requests/58
Could you have a look at the code please, Florian?
I have already run some experiments for ["--heuristic", "h=ff()", "--search", "lazy_greedy(h,
preferred=h)"]. It seems the runtime is dominated by the heuristic computation (which is probably
true for most heuristics that mark preferred operators), so I'd say we hold off on further
experiments until we're all fine with the code.
|
msg4235 (view) |
Author: jendrik |
Date: 2015-06-03.16:48:29 |
|
The flag is currently used for marking preferred operators. It should be removed to allow for
cleaner code.
|
|
Date | User | Action | Args |
2016-10-17 10:03:09 | jendrik | set | status: reviewing -> resolved messages:
+ msg5745 |
2016-10-14 22:20:32 | malte | set | messages:
+ msg5743 |
2016-10-14 19:05:58 | jendrik | set | messages:
+ msg5741 |
2016-10-14 18:05:28 | malte | set | messages:
+ msg5740 |
2016-10-14 15:28:54 | jendrik | set | messages:
+ msg5737 |
2016-10-13 14:40:19 | jendrik | set | messages:
+ msg5730 |
2016-10-13 10:38:43 | malte | set | messages:
+ msg5729 |
2016-10-13 10:22:46 | malte | set | messages:
+ msg5728 |
2016-10-13 10:11:26 | malte | set | messages:
+ msg5727 |
2016-10-13 10:05:44 | jendrik | set | messages:
+ msg5726 |
2016-10-13 08:45:00 | malte | set | messages:
+ msg5725 |
2016-10-13 01:20:22 | jendrik | set | messages:
+ msg5724 |
2016-10-12 18:52:02 | malte | set | messages:
+ msg5720 |
2016-10-12 18:42:02 | jendrik | set | messages:
+ msg5719 |
2016-10-11 13:58:12 | malte | set | messages:
+ msg5708 |
2016-09-30 11:39:18 | malte | set | messages:
+ msg5698 |
2016-09-29 17:01:36 | jendrik | set | messages:
+ msg5697 |
2016-09-29 10:14:13 | florian | set | messages:
+ msg5693 |
2016-09-29 09:43:46 | florian | set | messages:
+ msg5692 |
2016-09-28 23:23:50 | jendrik | set | messages:
+ msg5688 |
2016-09-28 22:11:16 | malte | set | messages:
+ msg5686 |
2016-09-28 18:13:30 | jendrik | set | messages:
+ msg5681 |
2016-09-27 14:13:30 | florian | set | messages:
+ msg5666 |
2016-09-26 11:06:06 | florian | set | messages:
+ msg5664 |
2016-09-24 16:04:00 | malte | set | messages:
+ msg5663 |
2016-09-24 14:38:08 | malte | set | messages:
+ msg5662 |
2016-09-24 11:34:24 | jendrik | set | messages:
+ msg5661 |
2016-09-23 18:36:33 | malte | set | messages:
+ msg5659 |
2016-09-23 17:57:05 | jendrik | set | messages:
+ msg5658 |
2016-09-20 22:04:06 | florian | set | messages:
+ msg5655 |
2016-09-19 17:42:16 | malte | set | messages:
+ msg5650 |
2016-09-19 15:51:52 | jendrik | set | status: chatting -> reviewing messages:
+ msg5647 |
2016-09-16 22:42:18 | jendrik | set | status: unread -> chatting nosy:
+ florian messages:
+ msg5635 |
2016-09-16 11:40:35 | jendrik | set | assignedto: jendrik |
2015-06-03 16:48:29 | jendrik | create | |