msg11153 (view) |
Author: malte |
Date: 2023-07-21.14:43:52 |
|
[We looked at this as part of a triaging round. Opinions below agreed at least by the group that was present. ;-)]
Marking this as resolved because it's a wish issue with no specific implementation plan and there is nothing in the discussion that we don't know well.
Specifically, we know that our invariant synthesis algorithm is unsuitable for grounded tasks and not a great invariant synthesis algorithm in general. The solution for this would be to implement something better; I don't think we should expend the effort trying to doctor with the existing algorithm much more to make it more reproducible in the cases where it does poorly. (There would be some value in this, but I think it isn't good value for effort.)
The real solution would be to implement something like Jussi's lifted invariant synthesis algorithm, perhaps with some extra stuff to deal with language fragments he doesn't cover. If someone is interested in actively working on this, this would be worth adding a new issue, but we generally don't create these without a concrete intention to follow through.
|
msg2757 (view) |
Author: jendrik |
Date: 2013-12-03.17:43:06 |
|
I removed pathways from suite_satisficing_with_ipc11 and suite_all.
|
msg2756 (view) |
Author: malte |
Date: 2013-12-03.13:39:04 |
|
OK, we should make sure not to use this in our papers then. Maybe remove
pathways from the suite? (pathways-noneg is a semantically identical
reformulation with negative preconditions compiled away)
|
msg2755 (view) |
Author: jendrik |
Date: 2013-12-03.13:27:16 |
|
ALL includes both pathways and pathways-noneg.
|
msg2754 (view) |
Author: malte |
Date: 2013-12-03.12:56:29 |
|
Thanks! It's the custom that if a domain has multiple alternative formulations
of alternative problems, planners can pick their formulation. That's why ALL
only includes one formulation per domain. (Does it include pathways +
pathways-neg, for example? I'm not sure if we'd want that.)
If I didn't overlook anything, it seems there are no further problematic cases.
|
msg2753 (view) |
Author: jendrik |
Date: 2013-12-03.12:05:39 |
|
I didn't know that the suite ALL does not include airport-adl, no-mprime and no-
mystery. These are included in the attached results now. Also I fixed a markup
error for the empty lists in assembly.
|
msg2744 (view) |
Author: gabi |
Date: 2013-12-02.17:44:53 |
|
Using the action name to get the heavy action is indeed a bug. I split this up
as issue399.
|
msg2741 (view) |
Author: malte |
Date: 2013-12-02.14:27:35 |
|
> Also, while looking at the source code of BalanceChecker.get_heavy_action, it
> seems to rely on the assumption that action names are unique, which I think
> isn't necessarily true after normalization. What do you think?
@Gabi: what are your thoughts on this?
|
msg2740 (view) |
Author: malte |
Date: 2013-12-02.14:25:32 |
|
Observations:
* It seems we only have data for our "regular" domains? (See msg2719.)
* There seems to be something wrong in the parser used to generate the
"taskwise-invariant-attributes" report. The empty lists as in Assembly look odd.
* There are two domains where runtimes are very large: openstacks-strips and
trucks-strips. Interestingly, we never seem to time out. Instead, we hit the
MAX_CANDIDATES limit of 100000. This is very different from what we see in
msg2717 and suggests that there is indeed a severe speed degradation with the
version tested in that message.
* Apart from the two critical domains and a third one, psr-small, we never look
at more than 1000 candidates. In psr-small, where we get up to 2574 candidates.
* Also, apart from these three domains (openstacks-strips, trucks-strips,
psr-small), the maximal invariant size before grounding is 4, whereas in these
domains we find invariants of size up to 24, and probably would find even larger
ones if we didn't cut the generation short after a certain time/number of
candidates.
* These three domains are the only ones that are fully grounded. At least that's
what 'grep -L '?' benchmarks/*/*dom*' suggests.
In conclusion, it looks like our invariant synthesis algorithm is not that
suitable for problems that are fully grounded. This isn't a surprise, but it's
interesting to see this brought out so nicely in the data.
There are different ways in which this could be addressed. If we want to
preserve the old performance *and* not lose any invariants *and* be
deterministic, I think the only real solution is to detect grounded problems and
use a more suitable algorithm for them. Of course this would be a lot of work. I
suggest we discuss this further offline.
|
msg2738 (view) |
Author: malte |
Date: 2013-12-01.12:27:44 |
|
Interesting, looks like we don't find any invariants in Schedule. Shouldn't we?
I thought we used to in the past.
|
msg2737 (view) |
Author: jendrik |
Date: 2013-12-01.11:37:27 |
|
Here are the changes I made to log the requested values, please check that I
made the correct measurements:
https://bitbucket.org/jendrikseipp/downward/commits/94d34322b477ca5cf90943ba22c2
8899765133f8?at=default
The results are attached.
|
msg2736 (view) |
Author: malte |
Date: 2013-11-30.18:06:17 |
|
Yes, please.
I just checked a code: it's a BFS, as it's using a deque where candidates are
popped from the beginning and pushed to the end.
|
msg2735 (view) |
Author: jendrik |
Date: 2013-11-30.18:04:43 |
|
I don't know how the code corresponds to a BFS or DFS, but I could try to produce
the numbers you requested in msg2719. Should I do that?
|
msg2723 (view) |
Author: malte |
Date: 2013-11-23.19:59:42 |
|
Ah, thanks, I was misled by bitbucket's diff context, which only shows the
too-heavy test. I should have expanded the view.
Still, if I see it correctly, all that happens here is that we modify the order
of successor generation in a breadth-first search. Right? (Or do we perform a
depth-first search?)
Whatever the successor order, in a BFS we proceed layer-by-layer through the
space of candidates, and we abort if we have explored the whole tree (which
doesn't depend on the order) or time out (which does depend on the order, but
isn't reproducible either way).
So I don't think the added code and time complexity offsets the benefits here.
|
msg2722 (view) |
Author: jendrik |
Date: 2013-11-23.19:50:10 |
|
operator_unbalanced() has side effects since it adds new candidates.
|
msg2721 (view) |
Author: malte |
Date: 2013-11-23.19:44:09 |
|
Hmmm, this doesn't affect the outcome of the algorithm (except that it may be
faster or slower, but not in a predictable way). In a loop like
for x in y:
if f(x):
return False
the order in which y is traversed doesn't matter unless f has side effects
(which it doesn't here).
I would recommend to revert this change.
Also, while looking at the source code of BalanceChecker.get_heavy_action, it
seems to rely on the assumption that action names are unique, which I think
isn't necessarily true after normalization. What do you think?
|
msg2720 (view) |
Author: jendrik |
Date: 2013-11-23.19:37:21 |
|
The change that I made and Gabi tested can be seen here (if you click on the
"Diff" button):
https://bitbucket.org/jendrikseipp/downward/branch/issue22
|
msg2719 (view) |
Author: malte |
Date: 2013-11-22.21:35:03 |
|
Because it involves a timeout, invariant synthesis will never be deterministic
in a strong sense. For cases that don't involve the timeout or cutoff, I'm not
sure if the change discussed in msg2693 actually makes a difference. Can I have
a look at the changeset somewhere?
Maybe we should add some stronger cutoff criteria so that we don't hit the time
limit as often as we currently do. This might also help with determinism.
I suggest the following experiment: for every task that the translator can
handle (including domain formulations we don't have in our standard benchmark
suites, such as airport-adl), report the following data:
- did invariant synthesis terminate normally
- how much time did invariant synthesis take
- how many candidates were seen
- how many invariants were found
- what is the length of each of these invariants, *before* grounding (i.e.,
don't convert to "fact groups", stay with the first-order representation)
Simple prints are enough; we don't need a lab report for this.
|
msg2717 (view) |
Author: malte |
Date: 2013-11-22.21:22:57 |
|
Moved over from issue22. This is about the part of that issue described in
msg2685 and later.
Summary: invariant synthesis in the translator isn't always deterministic, and
we'd like to do something about this. It can also be quite slow.
Quoting from msg2707:
========================================================================
There is some significant performance degeneration in openstacks-strips.
I assume that the problem is that the operator description in this domain are
already grounded, so there are lots of actions to be processed by the
invariance analysis.
Jendrik, I think we should try to catch this non-determinism earlier so that we
do not need to sort the threatening actions so often. Can you check whether
the difference occurs first with the actions_to_check set or already earlier
in the sets with the threatening actions for a predicate?
If someone is interested, hre are the problematic translator times:
<task> <old translator time> -> <new translator time>
openstacks-strips/p06.pddl 40.6s -> 60.8s
openstacks-strips/p07.pddl 44.4s -> 60.8s
openstacks-strips/p08.pddl 56.8s -> 300.6s
openstacks-strips/p09.pddl 53.6s -> 300.6s
openstacks-strips/p10.pddl 41.0s -> 300.8s
openstacks-strips/p11.pddl 40.0s -> 300.7s
openstacks-strips/p12.pddl 45.3s -> 300.8s
openstacks-strips/p13.pddl 36.9s -> 300.8s
openstacks-strips/p14.pddl 41.0s -> 300.8s
openstacks-strips/p15.pddl 44.9s -> 300.8s
openstacks-strips/p16.pddl 40.3s -> 300.8s
openstacks-strips/p17.pddl 45.3s -> 301.0s
openstacks-strips/p18.pddl 45.6s -> 302.2s
openstacks-strips/p19.pddl 38.1s -> 301.2s
openstacks-strips/p20.pddl 38.4s -> 301.7s
openstacks-strips/p21.pddl 40.1s -> 301.7s
openstacks-strips/p22.pddl 38.7s -> 301.8s
openstacks-strips/p23.pddl 38.4s -> 301.8s
openstacks-strips/p24.pddl 51.6s -> 309.5s
openstacks-strips/p25.pddl 54.3s -> 308.9s
openstacks-strips/p26.pddl 42.9s -> 305.2s
openstacks-strips/p27.pddl 47.9s -> 312.2s
openstacks-strips/p28.pddl 116.7s -> 357.8s
openstacks-strips/p29.pddl 107.8s -> 352.1s
openstacks-strips/p30.pddl 62.5s -> 321.3s
========================================================================
|
|
Date |
User |
Action |
Args |
2023-07-21 14:43:52 | malte | set | status: chatting -> resolved messages:
+ msg11153 |
2014-10-04 19:57:57 | malte | set | keyword:
+ translator |
2013-12-03 17:43:06 | jendrik | set | messages:
+ msg2757 |
2013-12-03 13:39:04 | malte | set | messages:
+ msg2756 |
2013-12-03 13:27:16 | jendrik | set | messages:
+ msg2755 |
2013-12-03 12:56:29 | malte | set | messages:
+ msg2754 |
2013-12-03 12:05:39 | jendrik | set | files:
+ analyze-invariant-generation-alldomains-eval.tar.gz messages:
+ msg2753 |
2013-12-02 17:44:53 | gabi | set | messages:
+ msg2744 |
2013-12-02 14:27:35 | malte | set | messages:
+ msg2741 |
2013-12-02 14:25:33 | malte | set | messages:
+ msg2740 |
2013-12-01 12:27:44 | malte | set | messages:
+ msg2738 |
2013-12-01 11:37:27 | jendrik | set | files:
+ analyze-invariant-generation-eval.tar.gz messages:
+ msg2737 |
2013-11-30 18:06:17 | malte | set | messages:
+ msg2736 |
2013-11-30 18:04:43 | jendrik | set | messages:
+ msg2735 |
2013-11-23 19:59:42 | malte | set | messages:
+ msg2723 |
2013-11-23 19:50:10 | jendrik | set | messages:
+ msg2722 |
2013-11-23 19:44:09 | malte | set | messages:
+ msg2721 |
2013-11-23 19:37:21 | jendrik | set | messages:
+ msg2720 |
2013-11-22 21:35:03 | malte | set | messages:
+ msg2719 |
2013-11-22 21:22:57 | malte | create | |