Issue774

Title Get rid of g_axiom_evaluator, g_state_packer, and g_initial_state_data
Priority wish Status resolved
Superseder Nosy List florian, guillem, jendrik, malte
Assigned To florian Keywords
Optional summary
This is part of issue509. It would include the change suggested in issue699.

Created on 2018-04-08.15:12:17 by florian, last changed by florian.

Summary
This is part of issue509. It would include the change suggested in issue699.
Messages
msg7452 (view) Author: florian Date: 2018-09-14.10:20:24
Great! I merged and pushed the changes.
msg7451 (view) Author: malte Date: 2018-09-14.10:14:28
I really like the last changes! :-) Feel free to merge.
msg7450 (view) Author: malte Date: 2018-09-14.09:28:01
Looks good! Perhaps let me have a brief look at the last commits before we merge.
msg7448 (view) Author: florian Date: 2018-09-13.23:18:52
Thanks. I handled your comments and updated the pull request.

The experiments are also done. I got lots of unexplained errors which are caused
by an old lab version that doesn't understand the new exit codes yet. There are
also a couple of the usual "cg(null)" lines in the output but otherwise
everything looks ok. 

The coverage of lama-first decreases by 6 tasks, although I suspect 2 of them
are duplicates (transport 2011/2014 and scanalyzer 2008/2011). I looked at the
runtime and memory distribution of other tasks in those domains and only saw a
very small constant added to the memory usage as we often do when the code changes. 

In the other configurations, coverage stays the same and the other metrics all
look reasonably close.

The full report is here:
http://ai.cs.unibas.ch/_tmp_files/pommeren/issue774-v2-sat-issue774-v2-base-issue774-v2-compare.html

From my perspective, this is ready to be merged.
msg7445 (view) Author: malte Date: 2018-09-13.19:32:47
I'm done with my comments on bitbucket.
msg7437 (view) Author: florian Date: 2018-09-13.15:01:17
Thanks for the review. I updated the pull request and also merged the current
default branch again. I started repeating the experiment for the new revisions
but in the meantime, this is ready for another round of review.
msg7415 (view) Author: malte Date: 2018-09-12.19:31:13
I had a look at the code, but only really from a high-level perspective, and
from that perspective it looks good. Someone else should check the details, though.

Regarding how much axiom evaluators etc. are part of a task, I guess it depends
on which way you look at it. The argument against having them as an integral
part is that our "AxiomEvaluator" is just one particular way of implementing the
evaluation of the derived values. At some level, I think it's just as much or
just as little a part of the task as a successor generator is.

On the awkwardness of where/how/when to compute the axiom data for the initial
state, I agree that this still requires further thought, although perhaps we can
get away with not considering as part of this issue. Not sure, I didn't look at
this bit closely enough because it needs much more time and focus than I have
right now. Something we could discuss tomorrow. One long-term option is not
evaluating them at all the way we currently do. It's not actually clear whether
it's worth always evaluating the derived variables anyway. One could argue that
this data is not a core part of the state and for example should not be stored
in the state registry. (For example, in PSR, I'm sure the derived variables
waste lots and lots of memory.) One can imagine a world where derived variables
are only computed on demand, or optionally stored in a PerStateInformation.
(Because they are always binary, it could be a PerStateBitset.) I'm not
advocating any particular action for the moment, just thinking about possible
directions we can go later on to get rid of the current awkwardness there.

Regarding state packers etc.: I always think of them as something that belongs
to our particular storage mechanism, not to the *task*. In the long term, the
whole StateRegistry mechanism we use could be an optional part of the planner.
For example, I would like to eventually be able to do IDA* search without
duplicate elimination without the StateRegistry mechanism getting in the way. We
are a long way away from that, but to eventually get there, the "abstract task"
component of the planner should not depend on the "state registry" component of
the planner, whereas the opposite dependency is fine, because the abstract tasks
are clearly a core part of the planner.
msg7382 (view) Author: florian Date: 2018-09-11.22:16:50
We discussed this issue today and looked at the broader picture of supporting
heavy objects associated with tasks in general. Some of these objects (e.g., the
causal graph, landmarks) are optional, so we wanted to come up with a solution
that works for them and then see if that solution also is useful for the data in
this issue (axiom evaluator, state packer, initial state data) which is more
closely linked to the task.

Since we want to keep the optional objects modular, we cannot implement getters
for them in AbstractTask. We ended up with a global registry
(PerTaskInformation) to store the objects. This is similar to what we had at the
start of this issue but with an added subscriber mechanism that deletes data for
a task when that task is destroyed.

This can also be used for the axiom evaluator and the state packer but it would
be a bit awkward for the initial state data, I think. In fact, since the axiom
evaluator is needed to compute the correct values of the initial state, I also
find it awkward not to have it in the task. All three objects seem so closely
linked to the task that I'm not sure we should treat them the same as for
example the causal graph. For example, the axiom evaluator is still used in the
root task's constructor, and all three objects are required to create any
registered states in the task. What do you think?

I pushed my implementation with PerTaskInformation for axiom evaluator and state
packer to the pull request. Here is the link again:
https://bitbucket.org/FlorianPommerening/downward-issues/pull-requests/42

Maybe this only seem awkward to me because I made the implementation more
complex than necessary. Could you have another look?
msg7288 (view) Author: florian Date: 2018-07-06.19:31:13
There are two types of cases where I used mutable:

1) For data that is used during a computation of a method of the class and is
only a member for efficiency reasons:
  AxiomRule::unsatisfied_conditions
  AxiomEvaluator::queue

2) For objects in the task that are unique pointers to objects created on-demand:
  DomainAbstractedTask::state_packer
  RootTask::state_packer
  RootTask::axiom_evaluator


I think we talked about wrapping the second case in a templated class. Shoudl I
give this a try? I don't know what we can do about the first case (or if we should).
msg7287 (view) Author: malte Date: 2018-07-06.18:41:05
There are may new uses of "mutable" in the pull request, not just one particular
use in the axiom evaluator. The strategy used when it's OK to violate const-ness
and when it isn't wasn't clear to me from the diff. What we want to avoid is a
new coding convention where "const" has no meaning.
msg7286 (view) Author: florian Date: 2018-07-06.17:56:14
We didn't find time to discuss this code in detail but we talked about in in
general terms on the way to the train station. I remember you not liking using
mutable in the axiom evaluator, since caching with mutables is a common source
of errors. I can't remember if we talked about a solution. If the axiom
evaluator is not const, we cannot get it from a const RootTask which is problematic.

I would like to make some progress with this issue because issue791, issue792,
issue793, and issue794 depend on it. Do you have a suggestion for a next step?
msg7205 (view) Author: malte Date: 2018-06-05.19:15:04
I had a look at the code, but it may be better to discuss this live than on
bitbucket. Should we try to find some time for this tomorrow?
msg7186 (view) Author: jendrik Date: 2018-06-05.07:59:08
Good point.
msg7185 (view) Author: florian Date: 2018-06-05.07:55:11
Getting rid of the code duplication is issue794. I don't want to delay this
issue, especially since Malte already looked at the results.

In issue794 the code duplication is removed by a common interface to state data
given as a vector and given in packed representation, so we still evaluate
axioms on the packed representation. If you think it is better to unpack the
states for evaluation, we should move the discussion there.
msg7182 (view) Author: jendrik Date: 2018-06-04.22:19:05
The patch duplicates the axiom evaluation code for States and GlobalStates. Have 
we ever evaluated the performance impact of unpacking global states before 
evaluating axioms? If the impact is small, we might only need one variant of the 
code.
msg7025 (view) Author: malte Date: 2018-04-10.11:51:39
I think the experimental results look OK.
msg7024 (view) Author: jendrik Date: 2018-04-10.10:12:55
I pushed the temporary fix to the default branch. This will be polished further 
in issue739 and when me make the VAL limits customizable on the commandline.
msg7023 (view) Author: jendrik Date: 2018-04-10.10:01:46
1) I think so, we have seen similar effects in other experiments using lama-
first.
2) The setup looks fine: you're using overall memory and time limits. I think 
the problem is VAL for which we currently don't enforce any memory limit. I'll 
cherrypick the patch for this from issue739, even though we originally decided 
against doing this change in "default" earlier.
3) I think this is again caused by VAL and its missing limits.
msg7022 (view) Author: florian Date: 2018-04-10.01:31:10
I ran experiments for this and overall, they look ok to me:

http://ai.cs.unibas.ch/_tmp_files/pommeren/issue774-v1-sat-issue774-base-issue774-v1-compare.html
http://ai.cs.unibas.ch/_tmp_files/pommeren/issue774-blind-total_time-base-v1.png
http://ai.cs.unibas.ch/_tmp_files/pommeren/issue774-lama-first-total_time-base-v1.png

There were a couple of strange effects, though:

1) There are two scanalyzer tasks (maybe the same in two domains) that have a
higher cost in the new code using lama-first. Is this configuration
non-deterministic?
2) A couple of tasks reached the slurm memory limit. Jendrik, did I set up the
experiment in the wrong way?
3) At least one task (runs-06801-06900/06805) exceeded our limits a lot: it had
a wall clock time of 2836.70s and a memory usage of 2601432 KB. Both should not
be possible. There were other tasks with high memory usages but this one was
extreme. But this also seems to be more an issue with the experiment than with
the code.
msg7020 (view) Author: jendrik Date: 2018-04-09.10:15:39
Florian and I discussed my comments and now we're both happy with the code 
(without any code changes :-)).
msg7019 (view) Author: jendrik Date: 2018-04-08.19:39:18
I'm done with the first round of comments. I like that the code nicely cleans up 
a lot of things (although it has to introduce other minor workarounds).
msg7018 (view) Author: jendrik Date: 2018-04-08.17:39:14
I'll have a look tomorrow.
msg7017 (view) Author: florian Date: 2018-04-08.17:27:26
I updated the pull request. No devils encountered so far.
msg7016 (view) Author: malte Date: 2018-04-08.16:12:26
First impression: sounds very good. (The devil may be in the details.)
msg7015 (view) Author: florian Date: 2018-04-08.16:08:18
You are right, this patch doesn't help much. Maybe we can make progress here, by
discussing where the objects should live eventually. Thinking of tasks with
limited lifetime, it sounds like the objects should live inside the
AbstractTask, so they will be deleted once the task is destroyed. Would it be an
option to add them to the task interface and add methods to the task proxy to
access them?
msg7014 (view) Author: malte Date: 2018-04-08.15:56:11
Understood. But looking at the code, it feels like it's trying to sweep global
data under the rug without really changing anything about the way the data is
organized except that we can now support multiple global tasks (but only
creating new ones, never removing one.)

I don't like that the code removes the warning signs for global data that we
intentionally put in to help us figure out which global data exists. I don't
think we should rename it from "g_axiom_evaluator" to "axiom_evaluator_cache",
when it is not really a cache (the data in it lives forever) and it is just as
global as it was before. I would prefer a name like "g_axiom_evaluators" to make
this clear, and ditto for the others.

The reason why we want to get rid of globals is because we want to make it
possible to create a task with a limited lifetime, do something with it, then
dispose of it. I think that at least for the moment we are making things worse
towards this goal, because it becomes harder to figure out what the globals are,
and now using an otherwise disposable task objects subtly pollutes the global
data as a side effect of innocuous-looking accessor functions, with no chance of
cleanup.

For things like the causal graphs, we lived with that for the time being because
we had to support them for arbitrary tasks or else wouldn't have been able to
continue with the task transformations, so we begrudgingly accepted the less
than ideal solution. But it should be clear that it's not in any way a solution
we want to promote to a general design pattern.
msg7012 (view) Author: florian Date: 2018-04-08.15:37:11
@Malte: one advantage I see is that the global data is not tied to a specific
task anymore.

We started a discussion a while ago about how and where we wanted to store the
heavy-weight objects for each task (successor generator, state registry, axiom
evaluator, state packer, causal graph, etc.). As far as I remember, we didn't
find a solution yet. This patch would use the same temporary solution that the
causal graph already uses. So my hope would be that we can unify the solutions
for the different classes more easily after this issue.
msg7011 (view) Author: florian Date: 2018-04-08.15:29:01
Here is the link to the patch:

https://bitbucket.org/FlorianPommerening/downward-issues/pull-requests/42

Jendrik, could you do an initial review?
msg7010 (view) Author: malte Date: 2018-04-08.15:20:56
From the description, it sounds like the change proliferates global data (by
introducing a global hash map that can keep alive a lot more data than just what
we currently have related to the root task) rather than reduce it.

I don't necessarily mind the change, but I would like to understand the
motivation and the eventual direction in which this should evolve better. Using
global data should be a solution of last resort.
msg7009 (view) Author: florian Date: 2018-04-08.15:12:17
I prepared a suggestion how we could get rid of the global variables related to
state creation. To keep this issue small, I would like to exclude the TODOs in
StateRegistry and the code duplication in AxiomEvaluator.

The global objects mentioned in the title currently are created for the root
task and stored in global variables. The patch would change this so the objects
are created like the causal graph: there is a static hash map from task pointers
 to created objects so each object is created at most once. The next steps would
be to unify this and to think about how and when objects created for a task can
be used for transformed tasks. But I suggest to handle this in a follow-up issue.
History
Date User Action Args
2018-09-14 10:20:24floriansetstatus: chatting -> resolved
messages: + msg7452
2018-09-14 10:14:28maltesetmessages: + msg7451
2018-09-14 09:28:01maltesetmessages: + msg7450
2018-09-13 23:18:52floriansetmessages: + msg7448
2018-09-13 19:32:47maltesetmessages: + msg7445
2018-09-13 15:01:17floriansetmessages: + msg7437
2018-09-12 19:31:13maltesetmessages: + msg7415
2018-09-11 22:16:50floriansetnosy: + guillem
messages: + msg7382
2018-07-06 19:31:13floriansetmessages: + msg7288
2018-07-06 18:41:05maltesetmessages: + msg7287
2018-07-06 17:56:14floriansetmessages: + msg7286
2018-06-05 19:15:04maltesetmessages: + msg7205
2018-06-05 07:59:08jendriksetmessages: + msg7186
2018-06-05 07:55:11floriansetmessages: + msg7185
2018-06-04 22:19:05jendriksetmessages: + msg7182
2018-04-10 11:51:40maltesetmessages: + msg7025
2018-04-10 10:12:55jendriksetmessages: + msg7024
2018-04-10 10:01:46jendriksetmessages: + msg7023
2018-04-10 01:31:10floriansetmessages: + msg7022
2018-04-09 10:15:39jendriksetmessages: + msg7020
2018-04-08 19:39:18jendriksetmessages: + msg7019
2018-04-08 17:39:14jendriksetmessages: + msg7018
2018-04-08 17:27:26floriansetmessages: + msg7017
2018-04-08 16:12:26maltesetmessages: + msg7016
2018-04-08 16:08:18floriansetmessages: + msg7015
2018-04-08 15:56:11maltesetmessages: + msg7014
2018-04-08 15:37:11floriansetmessages: + msg7012
2018-04-08 15:29:01floriansetmessages: + msg7011
2018-04-08 15:20:56maltesetstatus: unread -> chatting
messages: + msg7010
2018-04-08 15:12:17floriancreate