msg2147 (view) |
Author: jendrik |
Date: 2012-05-01.12:08:42 |
|
I opened a new issue for the python2.6 warning: issue332.
|
msg2145 (view) |
Author: malte |
Date: 2012-04-29.14:24:17 |
|
> Are you aware that we would have to bundle python2.7 or pass --force for
> all our experiments on the gkigrid then?
Yes -- but it's only the experiments where we actually run the translator, which
at least in my case is much less than 5%. I think it'd be a good reminder for us
that these runtimes should not be used for anything.
Do we have a way in the scripts to pass options to the translator?
> Shouldn't we wait until python2.7 is available there?
I don't think so. The typical upgrade cycle for compute infrastructure like this
is four years or more. The default Python at the compute cluster in Basel is
Python 2.4 (although, to be fair, python2.6 is available too; it's just not the
default that you get with "Python").
The main thing we should care about here is that people who use our code to
compare their planners against report runtimes that we can live with. As long as
the planner works out of the box with Python 2.6, I would expect that in 80% or
more of the cases this will not be the case.
|
msg2144 (view) |
Author: jendrik |
Date: 2012-04-29.14:15:05 |
|
Are you aware that we would have to bundle python2.7 or pass --force for
all our experiments on the gkigrid then? Shouldn't we wait until
python2.7 is available there?
|
msg2143 (view) |
Author: jendrik |
Date: 2012-04-29.14:15:03 |
|
1) Yes I can.
|
msg2142 (view) |
Author: malte |
Date: 2012-04-29.14:09:01 |
|
Buildbot tests seem to work, thanks guys! :-)
Two more things:
1) I think we really need a big fat warning when Python 2.6 is used. Actually, I
think the planner should even refuse to run unless it is passed a special option
("--force" or whatever). ("Errors should never pass silently, unless explicitly
silenced", and major performance degradation definitely is an error in our field.)
But this can be a separate issue. Jendrik, can you take care of this?
2) Once 1) is done, an email should be sent to the Google group.
|
msg2141 (view) |
Author: malte |
Date: 2012-04-29.13:35:54 |
|
Ah, OK. It'd be good to work on the branches in the future to make sure we got
all the changes in a place where we can find them again.
"Closing" a branch is a bit of a misnomer -- they can be opened and closed an
arbitrary number of times. A branch is just a piece of metadata associated with
a revision that is inherited from the parent (or first parent, in case of a
merge), and "closing" a branch just means adding a piece of global metadata that
causes the branch to not be reported by "hg branch" by default.
So it's no problem to use "hg branch" again with a branch that is already
closed, or to update to a closed branch and keep working on it. (It will be
considered open again as soon as a commit with that branch name is made.)
|
msg2140 (view) |
Author: jendrik |
Date: 2012-04-29.13:29:36 |
|
The issue329 branch had already been closed, so I directly pushed the changes in
misc to master.
|
msg2139 (view) |
Author: malte |
Date: 2012-04-29.13:20:10 |
|
Thanks, Erez!
Jendrik, I don't see your latest changes in your repository. The last changes
there were on April 16.
Can you push what you mention in msg2133 to the issue329 branch?
|
msg2138 (view) |
Author: erez |
Date: 2012-04-29.09:07:07 |
|
I've incorporated the quick test in the full build (5 minutes after every
commit), and the "./test-translator first" in the nightly and weekly builds.
The change is in branch issue329test in my downward-fixes repository
(ssh://hg@bitbucket.org/batman/downward-fixes).
I'm afraid I can't really test this, since I don't have a build master.
Once you update the master.cfg file, don't forget to tell the build master to
reload it.
|
msg2134 (view) |
Author: jendrik |
Date: 2012-04-27.12:09:47 |
|
Adding erez to the discussion.
I would recommend running ./test-translator without arguments on every commit
(84% code coverage) and "./test-translator first" nightly (85% code coverage).
|
msg2133 (view) |
Author: jendrik |
Date: 2012-04-27.12:05:02 |
|
Done. Citing the documentation:
# Run small test on a few problems.
# $ ./test-translator
#
# Run bigger test on the first problem of each domain.
# $ ./test-translator first
#
# Run test on specific problems.
# $ ./test-translator gripper:prob01.pddl depot:pfile1
#
# Returns 0 on success and 1 if an error occurs.
|
msg2132 (view) |
Author: malte |
Date: 2012-04-27.10:55:43 |
|
Your choice, but it'd be good if it had a decent code coverage.
|
msg2131 (view) |
Author: jendrik |
Date: 2012-04-27.09:45:02 |
|
OK, then let's have a big and a small test. The big one runs all first
problems now. Which problems shell the small one run?
|
msg2130 (view) |
Author: malte |
Date: 2012-04-25.13:59:26 |
|
OK, discussion on the mailing list seems to have petered out.
The remaining TODO is to set up the tests for the buildbot. (As discussed on the
list, this might require defining a simpler test to run on every commit in
addition to the more complex ones to run daily.)
|
msg2129 (view) |
Author: jendrik |
Date: 2012-04-18.18:19:21 |
|
I pushed the code and sent the mail.
|
msg2128 (view) |
Author: malte |
Date: 2012-04-18.15:02:10 |
|
Hmmm, I wrote a long comment and somehow lost it. Shorter version:
- Python 2.6 is indeed much slower than 2.7 or 3.2 here. I verified this locally
on PSR-Large #50: 730 seconds with Python 2.6; 108 seconds with Python 2.7.
This is not due to wrong time measurement. The performance difference is
actually there.
- Considering this, I would say that we should require Python >= 2.7 in the
future. We should adjust the hash-bang lines, the new-scripts, the
new-new-scripts, and add a warning to the code if it's manually called with
an older Python.
- Since upgrading to a new Python will make the translator much faster,
I don't mind the 10% of performance loss we get through Python 3
compatibility, so the code for this issue can be merged.
- The buildbot should automatically test Python 2.6, 2.7 and 3.x
compatibility as a new test. This may require Erez to install 2.7 and 3.x.
- All of the above, and also the background of this (that you ported the
planner to 3.x and what the performance number are) should be discussed on
downward-dev. Can you send a summary of everything to the list?
- Once downward-dev is happy, we should send an announcement to the
Google group.
|
msg2127 (view) |
Author: jendrik |
Date: 2012-04-18.13:50:06 |
|
Here are the results for python2.6 with a manually compiled python.
|
msg2126 (view) |
Author: jendrik |
Date: 2012-04-18.00:15:03 |
|
This may indeed be the cause of the strange times. Habakuk (and probably
the cluster) has python 2.6.0 and the bug in os.times() (Issue #1040026:
Fix os.times result on systems where HZ is incorrect.) was fixed in 2.6.2.
I am running the experiment with 2.6.8 now, so we should be fine. The
other versions are not affected either.
|
msg2125 (view) |
Author: malte |
Date: 2012-04-17.21:36:18 |
|
Another point to consider if runtime results are strange: Python had a bug in
os.times() for a long time. It may be worth also measuring translator runtime
externally with the bash "time" and/or "times" commands. ("time" is usually
easier to work with.)
|
msg2124 (view) |
Author: jendrik |
Date: 2012-04-17.19:40:03 |
|
The 2.7 and 3.2 experiments have been done with manually compiled python
versions. I'll rerun the 2.6 experiment with a manually compiled python
as well (measuring also the memory).
|
msg2123 (view) |
Author: jendrik |
Date: 2012-04-17.19:15:02 |
|
Only for Python 2.7, not fpr Python 2.6. Shall I run the experiment again?
|
msg2122 (view) |
Author: malte |
Date: 2012-04-17.16:47:50 |
|
Do we have memory usage information for Python 2.x, too? (Can't check at the
moment; I'm on the road and the attachments are too large for my connection.)
If not, that would be another useful thing to add.
|
msg2121 (view) |
Author: jendrik |
Date: 2012-04-17.14:45:03 |
|
Here are the times and memory usage for python3.2.
How do we proceed from here?
|
msg2119 (view) |
Author: jendrik |
Date: 2012-04-17.10:50:03 |
|
Yes, the two experiments were both run on opteron_core.q. I'll run
another experiment for python3.x.
|
msg2118 (view) |
Author: malte |
Date: 2012-04-17.08:38:26 |
|
Are these on the same machines? I'd be interested in the 2.6 vs. 2.7 vs 3.x
runtimes, but here the 2.7 runtimes appear to be more than three times smaller
than the 2.6 ones (looking at "translator_time_done"), which looks a bit suspicious.
|
msg2116 (view) |
Author: jendrik |
Date: 2012-04-16.22:50:04 |
|
These results compare translator times and memory peaks for *python2.7*
(not python3.2 as the unfortunate naming js-translator3-2 might suggest,
the 2 was supposed to stand for 2nd experiment). The numbers look very
similar to the 2.6 results.
|
msg2109 (view) |
Author: malte |
Date: 2012-04-15.23:02:47 |
|
> > What are the numbers for memory?
>
> Those are not measured currently. How do you suppose we should measure them?
One easy way is to follow the approach used in the search code; see
get_peak_memory_in_kb in src/search/utilities.cc. This needs a test that we are
on a Linux platform; if not, it is OK to report that peak memory numbers are
unavailable.
|
msg2108 (view) |
Author: jendrik |
Date: 2012-04-15.21:15:03 |
|
> What are the numbers for memory?
Those are not measured currently. How do you suppose we should measure them?
> Can we get numbers for Python 2.7 and Python 3.2?
We would have to compile those manually first for the cluster and write
a more complicated experiment.
Maybe we should have a look at the code again first. From what I see the
parts that perform significantly worse are computing_fact_groups,
detecting_unreachable_propositions and instantiating_groups and
writing_output. I have tried looking at the code for the first three,
but haven't been able to find possible changes that are responsible for
the performance decline. Do you have any suspects that we could fix
while remaining python3 compatible?
|
msg2107 (view) |
Author: malte |
Date: 2012-04-15.20:00:29 |
|
Hmmm, roughly 10% loss in performance is not trivial, unfortunately.
What are the numbers for memory?
Can we get numbers for Python 2.7 and Python 3.2?
|
msg2106 (view) |
Author: jendrik |
Date: 2012-04-15.13:40:04 |
|
Here are the scatter plots comparing the two code versions for python2.6.
|
msg2105 (view) |
Author: malte |
Date: 2012-04-13.18:09:17 |
|
Code review done.
|
msg2103 (view) |
Author: jendrik |
Date: 2012-04-13.17:35:02 |
|
I just fixed that, it now runs all available versions from the set {2.6,
2.7, 3}.
>
> Some of the changes look like they could affect performance on Python 2. Can we
> do a before/after scatter-plot on translator time?
I would like to wait until after the code review for that. If there are
still more changes, we might have to do the same experiment twice...
|
msg2102 (view) |
Author: malte |
Date: 2012-04-13.17:00:26 |
|
(PS: Sounds like a lot of work, but once we have set something like this up, it
would also be useful for other changes to the translator we might do in the future.)
|
msg2101 (view) |
Author: malte |
Date: 2012-04-13.16:59:56 |
|
Thanks! It may take me some time to look at this properly, unfortunately.
It looks like the test script doesn't test Python 2.6?
Some of the changes look like they could affect performance on Python 2. Can we
do a before/after scatter-plot on translator time?
It would be great to also do the same thing for memory usage, and maybe also a
Python 2.6/2.7/3.x comparison on the same metrics.
|
msg2100 (view) |
Author: jendrik |
Date: 2012-04-13.16:55:03 |
|
I have added scripts that translate all first problems in each domain
for the two python versions. Also I set up a Rietveld issue at
http://codereview.appspot.com/6028046/.
|
msg2097 (view) |
Author: malte |
Date: 2012-04-13.12:00:23 |
|
Sounds good! It's easy to break compatibility, so it would be good if we had an
automated way to test this. Something very simple (run the planner with all
three python versions on some input and check that it completes) would be a good
start.
|
msg2096 (view) |
Author: jendrik |
Date: 2012-04-13.11:49:27 |
|
We should support support all common python versions: 2.6, 2.7 and also 3.x
As long as we don't use new python3 language features, it seems we can support
both major versions with a single codebase.
|
|
Date |
User |
Action |
Args |
2012-05-01 12:09:07 | jendrik | set | status: chatting -> resolved |
2012-05-01 12:08:43 | jendrik | set | superseder:
+ do not run translator with python2.6 messages:
+ msg2147 |
2012-04-29 14:24:17 | malte | set | messages:
+ msg2145 |
2012-04-29 14:15:05 | jendrik | set | messages:
+ msg2144 |
2012-04-29 14:15:03 | jendrik | set | messages:
+ msg2143 |
2012-04-29 14:09:02 | malte | set | messages:
+ msg2142 |
2012-04-29 13:35:54 | malte | set | messages:
+ msg2141 |
2012-04-29 13:29:37 | jendrik | set | messages:
+ msg2140 |
2012-04-29 13:20:10 | malte | set | messages:
+ msg2139 |
2012-04-29 09:07:08 | erez | set | messages:
+ msg2138 |
2012-04-27 12:09:47 | jendrik | set | nosy:
+ erez messages:
+ msg2134 |
2012-04-27 12:05:02 | jendrik | set | messages:
+ msg2133 |
2012-04-27 10:55:43 | malte | set | messages:
+ msg2132 |
2012-04-27 09:45:03 | jendrik | set | messages:
+ msg2131 |
2012-04-25 13:59:26 | malte | set | messages:
+ msg2130 |
2012-04-18 18:19:22 | jendrik | set | messages:
+ msg2129 |
2012-04-18 15:02:11 | malte | set | messages:
+ msg2128 |
2012-04-18 13:50:06 | jendrik | set | files:
+ js-translator-python3-2.6-eval.tar.gz messages:
+ msg2127 |
2012-04-18 13:49:19 | jendrik | set | files:
- unnamed |
2012-04-18 13:48:32 | jendrik | set | files:
- js-translator-python3-eval.tar.gz |
2012-04-18 00:15:03 | jendrik | set | files:
+ unnamed messages:
+ msg2126 |
2012-04-17 21:36:18 | malte | set | messages:
+ msg2125 |
2012-04-17 19:40:03 | jendrik | set | messages:
+ msg2124 |
2012-04-17 19:15:02 | jendrik | set | messages:
+ msg2123 |
2012-04-17 16:47:50 | malte | set | messages:
+ msg2122 |
2012-04-17 14:45:03 | jendrik | set | files:
+ js-translator-python3-3.2-eval.tar.gz messages:
+ msg2121 |
2012-04-17 10:50:03 | jendrik | set | messages:
+ msg2119 |
2012-04-17 08:38:26 | malte | set | messages:
+ msg2118 |
2012-04-16 22:50:04 | jendrik | set | files:
+ js-translator-python3-2.7-eval.tar.gz messages:
+ msg2116 |
2012-04-15 23:02:47 | malte | set | messages:
+ msg2109 |
2012-04-15 21:15:03 | jendrik | set | messages:
+ msg2108 |
2012-04-15 20:00:29 | malte | set | messages:
+ msg2107 |
2012-04-15 13:40:05 | jendrik | set | files:
+ js-translator-python3-eval.tar.gz messages:
+ msg2106 |
2012-04-13 18:09:17 | malte | set | messages:
+ msg2105 |
2012-04-13 17:35:02 | jendrik | set | messages:
+ msg2103 |
2012-04-13 17:00:26 | malte | set | messages:
+ msg2102 |
2012-04-13 16:59:56 | malte | set | messages:
+ msg2101 |
2012-04-13 16:55:03 | jendrik | set | messages:
+ msg2100 |
2012-04-13 12:00:23 | malte | set | status: unread -> chatting messages:
+ msg2097 |
2012-04-13 11:49:27 | jendrik | create | |