Issue132

Title invariant synthesis is sometimes incorrect
Priority bug Status resolved
Superseder Nosy List gabi, malte
Assigned To gabi Keywords 1.0
Optional summary

Created on 2010-10-18.18:29:00 by gabi, last changed by gabi.

Messages
msg1056 (view) Author: gabi Date: 2011-01-05.19:30:03
:-)

-> issue189

-> issue187

   * an empirical evaluation of the impact on the m&s heuristic
-> issue188
msg1055 (view) Author: malte Date: 2011-01-05.19:29:04
Correction: the evaluation of impact is split into two parts, one near-term
(issue188) and one longer-term (issue187).
msg1054 (view) Author: malte Date: 2011-01-05.19:28:06
The two new issues are: issue189 (refactoring), issue188 (evaluation of impact).
msg1050 (view) Author: malte Date: 2011-01-05.19:17:24
The code has been merged. Thanks for this wonderful work!

Gabi will open two new issues regarding:
 * some small refactorings of the new code that would be good but shouldn't
   be done right now to avoid risking anything for the IPC, and
 * cases where we have potential performance problems of various kinds
   (speed, fewer invariants, etc.) compared to the old, wrong code that we
   should investigate more closely.
msg874 (view) Author: gabi Date: 2010-12-16.17:45:42
I removed the experiment results because there was a bug in the experiment
scripts that ran the experiments with a wrong revision.
msg863 (view) Author: malte Date: 2010-12-14.23:47:36
> The code is ready for being reviewed:
> http://codereview.appspot.com/3634041

Hmmm, since there are such large chunks of new code, I'd prefer to read the
whole thing on paper instead of using Rietveld. That way I get syntax
highlighting and a better overall view of the thing. Is that OK?
msg862 (view) Author: malte Date: 2010-12-14.23:38:15
Well, great work!

Here's initial feedback. So far I only looked at the experimental results,
haven't looked at the code yet. From the experiments:

 1. The only cases where I'd like to see a closer look at what goes on are
    the few cases in trucks-strips where the step "Finding invariants" becomes
    substantially slower. There are a few instances there that behave much
    worse than the rest (in the range p19-p26, but especially p20-p22),
    and it would be good to know why.

 2. It would be good to compare the actual invariants computed, before and
    after. For that it's not necessary to run the whole translator (and
    it wouldn't output that info anyway). Instead, use "invariant_finder.py"
    as the main script, but first comment out everything starting from
    'print "Finding fact groups..."'.
    For later comparison, the output should be made canonical in some way.
    Sorting the entries of each printed invariant alphabetically and then
    sorting the invariants alphabetically by line would be sufficient for
    this.
    I don't know if and how easily the new_scripts currently support using
    a different top-level script than translate.py. If this is too difficult
    to change, we could of course create two new revisions for the test where
    translate.py is replaced by something that just contains the main code
    from invariant_finder.py. (But on the other hand, being able to make
    adaptations such as this one for an experiment would be a nice feature
    to support in general.)
msg854 (view) Author: malte Date: 2010-12-14.15:15:16
> It would be interesting to also look at the time for the actual invariant
> synthesis, which I think is printed separately, but I don't see it in the
> report. Or am I misremembering/missing something?

Never mind, it's already there. I was looking at the psr-large change results,
and it's probably missing there simply because there are no tasks where the
number varies a lot between the configurations.
msg853 (view) Author: malte Date: 2010-12-14.14:53:53
> The reference numbers for the invariant groups stay the same, so we
> probably find exactly the same invariants. For the time values,
> the overall picture is that everything takes slightly longer, but
> in most cases only negligible. The reason for this could also
> be a different workload of the grid, because this general trend
> appears also for the things that are executed before the invariant
> synthesis starts.

It would be interesting to also look at the time for the actual invariant
synthesis, which I think is printed separately, but I don't see it in the
report. Or am I misremembering/missing something?
msg852 (view) Author: gabi Date: 2010-12-14.14:45:48
I added for psr-large and trucks-strips reports (ending with -change), that show
all changes of more than 5 (whatever - concentrate on the results in seconds). A
positive number means that the new version is slower.
msg850 (view) Author: gabi Date: 2010-12-14.13:56:50
The code is ready for being reviewed:
http://codereview.appspot.com/3634041
msg849 (view) Author: gabi Date: 2010-12-14.13:55:17
I ran some experiments on the benchmark suite. 

The overall impact can be seen in file preprocessedtasks-d-abs.html.
213202a44ebe is the original revision, 7bc6764d4116 the new one.

The reference numbers for the invariant groups stay the same, so we
probably find exactly the same invariants. For the time values,
the overall picture is that everything takes slightly longer, but
in most cases only negligible. The reason for this could also
be a different workload of the grid, because this general trend
appears also for the things that are executed before the invariant
synthesis starts.

To make the more interesting changes more clear, I removed everything that stays
more or less the same or cannot be affected by the changes (because it occurs
earlier in the execution) from the evaluation (see
preprocessedtasks-d-abs-simple.html).

The only domains that might really be critical are psr-large and
trucks-strips. Hence I also added detailed results for these domains.
Now it would be cool if our evaluation scripts could give me a list
of the instances for which the change in the total translation time
differs by more than 5 seconds. I guess they already can do that, so
I will ask Jendrik, how this works.
msg848 (view) Author: gabi Date: 2010-12-14.12:08:20
I implemented a new version along the lines of Malte's AI 2009 article (but
keeping the restrictions on the invariant generation that were
already present in the old implementation).

This actually solves issue70 (unfortunately by finding no mutex groups 
anymore). Issue86 turned out to be a different problem.
msg573 (view) Author: gabi Date: 2010-10-18.18:29:00
The invariant synthesis needs to check that each add effect
has really and in any case a matching del effect (as described
in Malte's translator AIJ paper).

This will probably resolve issue 70 and issue 86.
History
Date User Action Args
2011-01-05 19:30:35gabisetstatus: chatting -> resolved
2011-01-05 19:30:03gabisetstatus: resolved -> chatting
messages: + msg1056
2011-01-05 19:29:04maltesetstatus: chatting -> resolved
messages: + msg1055
2011-01-05 19:28:06maltesetstatus: resolved -> chatting
messages: + msg1054
2011-01-05 19:17:24maltesetstatus: testing -> resolved
messages: + msg1050
2010-12-16 17:45:42gabisetstatus: reviewing -> testing
messages: + msg874
2010-12-16 17:44:05gabisetfiles: - preprocessedtasks-trucks-strips-p-abs.html
2010-12-16 17:44:03gabisetfiles: - preprocessedtasks-trucks-strips-p-change.html
2010-12-16 17:44:00gabisetfiles: - preprocessedtasks-psr-large-p-change.html
2010-12-16 17:43:57gabisetfiles: - preprocessedtasks-psr-large-p-abs.html
2010-12-16 17:43:53gabisetfiles: - preprocessedtasks-d-abs.html
2010-12-16 17:43:46gabisetfiles: - preprocessedtasks-d-abs-simple.html
2010-12-14 23:47:36maltesetmessages: + msg863
2010-12-14 23:38:15maltesetmessages: + msg862
2010-12-14 15:15:16maltesetmessages: + msg854
2010-12-14 14:53:53maltesetmessages: + msg853
2010-12-14 14:45:48gabisetmessages: + msg852
2010-12-14 14:43:47gabisetfiles: + preprocessedtasks-psr-large-p-change.html
2010-12-14 14:43:33gabisetfiles: + preprocessedtasks-trucks-strips-p-change.html
2010-12-14 13:56:50gabisetstatus: testing -> reviewing
messages: + msg850
2010-12-14 13:55:17gabisetmessages: + msg849
2010-12-14 13:44:16gabisetfiles: + preprocessedtasks-d-abs-simple.html
2010-12-14 13:43:52gabisetfiles: + preprocessedtasks-psr-large-p-abs.html
2010-12-14 12:19:59gabisetfiles: + preprocessedtasks-d-abs.html
2010-12-14 12:19:43gabisetfiles: - preprocessedtasks-p-abs.html
2010-12-14 12:11:03gabisetfiles: + preprocessedtasks-trucks-strips-p-abs.html
2010-12-14 12:10:51gabisetfiles: + preprocessedtasks-p-abs.html
2010-12-14 12:08:21gabisetstatus: chatting -> testing
messages: + msg848
2010-10-18 18:29:00gabicreate