Issue329

Title add python3 support for translator
Priority feature Status resolved
Superseder do not run translator with python2.6
View: 332
Nosy List erez, jendrik, malte
Assigned To jendrik Keywords
Optional summary

Created on 2012-04-13.11:49:27 by jendrik, last changed by jendrik.

Files
File name Uploaded Type Edit Remove
js-translator-python3-2.6-eval.tar.gz jendrik, 2012-04-18.13:50:06 application/x-gzip
js-translator-python3-2.7-eval.tar.gz jendrik, 2012-04-16.22:50:04 application/x-gzip
js-translator-python3-3.2-eval.tar.gz jendrik, 2012-04-17.14:45:03 application/x-gzip
Messages
msg2147 (view) Author: jendrik Date: 2012-05-01.12:08:42
I opened a new issue for the python2.6 warning: issue332.
msg2145 (view) Author: malte Date: 2012-04-29.14:24:17
> Are you aware that we would have to bundle python2.7 or pass --force for 
> all our experiments on the gkigrid then?

Yes -- but it's only the experiments where we actually run the translator, which
at least in my case is much less than 5%. I think it'd be a good reminder for us
that these runtimes should not be used for anything.

Do we have a way in the scripts to pass options to the translator?

> Shouldn't we wait until python2.7 is available there?

I don't think so. The typical upgrade cycle for compute infrastructure like this
is four years or more. The default Python at the compute cluster in Basel is
Python 2.4 (although, to be fair, python2.6 is available too; it's just not the
default that you get with "Python").

The main thing we should care about here is that people who use our code to
compare their planners against report runtimes that we can live with. As long as
the planner works out of the box with Python 2.6, I would expect that in 80% or
more of the cases this will not be the case.
msg2144 (view) Author: jendrik Date: 2012-04-29.14:15:05
Are you aware that we would have to bundle python2.7 or pass --force for 
all our experiments on the gkigrid then? Shouldn't we wait until 
python2.7 is available there?
msg2143 (view) Author: jendrik Date: 2012-04-29.14:15:03
1) Yes I can.
msg2142 (view) Author: malte Date: 2012-04-29.14:09:01
Buildbot tests seem to work, thanks guys! :-)

Two more things:

1) I think we really need a big fat warning when Python 2.6 is used. Actually, I
think the planner should even refuse to run unless it is passed a special option
("--force" or whatever). ("Errors should never pass silently, unless explicitly
silenced", and major performance degradation definitely is an error in our field.)

But this can be a separate issue. Jendrik, can you take care of this?

2) Once 1) is done, an email should be sent to the Google group.
msg2141 (view) Author: malte Date: 2012-04-29.13:35:54
Ah, OK. It'd be good to work on the branches in the future to make sure we got
all the changes in a place where we can find them again.

"Closing" a branch is a bit of a misnomer -- they can be opened and closed an
arbitrary number of times. A branch is just a piece of metadata associated with
a revision that is inherited from the parent (or first parent, in case of a
merge), and "closing" a branch just means adding a piece of global metadata that
causes the branch to not be reported by "hg branch" by default.

So it's no problem to use "hg branch" again with a branch that is already
closed, or to update to a closed branch and keep working on it. (It will be
considered open again as soon as a commit with that branch name is made.)
msg2140 (view) Author: jendrik Date: 2012-04-29.13:29:36
The issue329 branch had already been closed, so I directly pushed the changes in 
misc to master.
msg2139 (view) Author: malte Date: 2012-04-29.13:20:10
Thanks, Erez!

Jendrik, I don't see your latest changes in your repository. The last changes
there were on April 16.

Can you push what you mention in msg2133 to the issue329 branch?
msg2138 (view) Author: erez Date: 2012-04-29.09:07:07
I've incorporated the quick test in the full build (5 minutes after every 
commit), and the "./test-translator first" in the nightly and weekly builds.

The change is in branch  issue329test   in my  downward-fixes repository 
(ssh://hg@bitbucket.org/batman/downward-fixes).

I'm afraid I can't really test this, since I don't have a build master.
Once you update the master.cfg file, don't forget to tell the build master to 
reload it.
msg2134 (view) Author: jendrik Date: 2012-04-27.12:09:47
Adding erez to the discussion. 

I would recommend running ./test-translator without arguments on every commit 
(84% code coverage) and "./test-translator first" nightly (85% code coverage).
msg2133 (view) Author: jendrik Date: 2012-04-27.12:05:02
Done. Citing the documentation:

# Run small test on a few problems.
# $ ./test-translator
#
# Run bigger test on the first problem of each domain.
# $ ./test-translator first
#
# Run test on specific problems.
# $ ./test-translator gripper:prob01.pddl depot:pfile1
#
# Returns 0 on success and 1 if an error occurs.
msg2132 (view) Author: malte Date: 2012-04-27.10:55:43
Your choice, but it'd be good if it had a decent code coverage.
msg2131 (view) Author: jendrik Date: 2012-04-27.09:45:02
OK, then let's have a big and a small test. The big one runs all first 
problems now. Which problems shell the small one run?
msg2130 (view) Author: malte Date: 2012-04-25.13:59:26
OK, discussion on the mailing list seems to have petered out.

The remaining TODO is to set up the tests for the buildbot. (As discussed on the
list, this might require defining a simpler test to run on every commit in
addition to the more complex ones to run daily.)
msg2129 (view) Author: jendrik Date: 2012-04-18.18:19:21
I pushed the code and sent the mail.
msg2128 (view) Author: malte Date: 2012-04-18.15:02:10
Hmmm, I wrote a long comment and somehow lost it. Shorter version:

- Python 2.6 is indeed much slower than 2.7 or 3.2 here. I verified this locally
  on PSR-Large #50: 730 seconds with Python 2.6; 108 seconds with Python 2.7.
  This is not due to wrong time measurement. The performance difference is
  actually there.

- Considering this, I would say that we should require Python >= 2.7 in the
  future. We should adjust the hash-bang lines, the new-scripts, the
  new-new-scripts, and add a warning to the code if it's manually called with
  an older Python.

- Since upgrading to a new Python will make the translator much faster,
  I don't mind the 10% of performance loss we get through Python 3
  compatibility, so the code for this issue can be merged.

- The buildbot should automatically test Python 2.6, 2.7 and 3.x
  compatibility as a new test. This may require Erez to install 2.7 and 3.x.

- All of the above, and also the background of this (that you ported the
  planner to 3.x and what the performance number are) should be discussed on
  downward-dev. Can you send a summary of everything to the list?

- Once downward-dev is happy, we should send an announcement to the
  Google group.
msg2127 (view) Author: jendrik Date: 2012-04-18.13:50:06
Here are the results for python2.6 with a manually compiled python.
msg2126 (view) Author: jendrik Date: 2012-04-18.00:15:03
This may indeed be the cause of the strange times. Habakuk (and probably 
the cluster) has python 2.6.0 and the bug in os.times() (Issue #1040026: 
Fix os.times result on systems where HZ is incorrect.) was fixed in 2.6.2.
I am running the experiment with 2.6.8 now, so we should be fine. The 
other versions are not affected either.
msg2125 (view) Author: malte Date: 2012-04-17.21:36:18
Another point to consider if runtime results are strange: Python had a bug in
os.times() for a long time. It may be worth also measuring translator runtime
externally with the bash "time" and/or "times" commands. ("time" is usually
easier to work with.)
msg2124 (view) Author: jendrik Date: 2012-04-17.19:40:03
The 2.7 and 3.2 experiments have been done with manually compiled python 
versions. I'll rerun the 2.6 experiment with a manually compiled python 
as well (measuring also the memory).
msg2123 (view) Author: jendrik Date: 2012-04-17.19:15:02
Only for Python 2.7, not fpr Python 2.6. Shall I run the experiment again?
msg2122 (view) Author: malte Date: 2012-04-17.16:47:50
Do we have memory usage information for Python 2.x, too? (Can't check at the
moment; I'm on the road and the attachments are too large for my connection.)

If not, that would be another useful thing to add.
msg2121 (view) Author: jendrik Date: 2012-04-17.14:45:03
Here are the times and memory usage for python3.2.

How do we proceed from here?
msg2119 (view) Author: jendrik Date: 2012-04-17.10:50:03
Yes, the two experiments were both run on opteron_core.q. I'll run 
another experiment for python3.x.
msg2118 (view) Author: malte Date: 2012-04-17.08:38:26
Are these on the same machines? I'd be interested in the 2.6 vs. 2.7 vs 3.x
runtimes, but here the 2.7 runtimes appear to be more than three times smaller
than the 2.6 ones (looking at "translator_time_done"), which looks a bit suspicious.
msg2116 (view) Author: jendrik Date: 2012-04-16.22:50:04
These results compare translator times and memory peaks for *python2.7* 
(not python3.2 as the unfortunate naming js-translator3-2 might suggest, 
the 2 was supposed to stand for 2nd experiment). The numbers look very 
similar to the 2.6 results.
msg2109 (view) Author: malte Date: 2012-04-15.23:02:47
> > What are the numbers for memory?
>
> Those are not measured currently. How do you suppose we should measure them?

One easy way is to follow the approach used in the search code; see
get_peak_memory_in_kb in src/search/utilities.cc. This needs a test that we are
on a Linux platform; if not, it is OK to report that peak memory numbers are
unavailable.
msg2108 (view) Author: jendrik Date: 2012-04-15.21:15:03
> What are the numbers for memory?
Those are not measured currently. How do you suppose we should measure them?

 > Can we get numbers for Python 2.7 and Python 3.2?
We would have to compile those manually first for the cluster and write 
a more complicated experiment.

Maybe we should have a look at the code again first. From what I see the 
parts that perform significantly worse are computing_fact_groups, 
detecting_unreachable_propositions and instantiating_groups and 
writing_output. I have tried looking at the code for the first three, 
but haven't been able to find possible changes that are responsible for 
the performance decline. Do you have any suspects that we could fix 
while remaining python3 compatible?
msg2107 (view) Author: malte Date: 2012-04-15.20:00:29
Hmmm, roughly 10% loss in performance is not trivial, unfortunately.
What are the numbers for memory?

Can we get numbers for Python 2.7 and Python 3.2?
msg2106 (view) Author: jendrik Date: 2012-04-15.13:40:04
Here are the scatter plots comparing the two code versions for python2.6.
msg2105 (view) Author: malte Date: 2012-04-13.18:09:17
Code review done.
msg2103 (view) Author: jendrik Date: 2012-04-13.17:35:02
I just fixed that, it now runs all available versions from the set {2.6, 
2.7, 3}.
>
> Some of the changes look like they could affect performance on Python 2. Can we
> do a before/after scatter-plot on translator time?
I would like to wait until after the code review for that. If there are 
still more changes, we might have to do the same experiment twice...
msg2102 (view) Author: malte Date: 2012-04-13.17:00:26
(PS: Sounds like a lot of work, but once we have set something like this up, it
would also be useful for other changes to the translator we might do in the future.)
msg2101 (view) Author: malte Date: 2012-04-13.16:59:56
Thanks! It may take me some time to look at this properly, unfortunately.
It looks like the test script doesn't test Python 2.6?

Some of the changes look like they could affect performance on Python 2. Can we
do a before/after scatter-plot on translator time?

It would be great to also do the same thing for memory usage, and maybe also a
Python 2.6/2.7/3.x comparison on the same metrics.
msg2100 (view) Author: jendrik Date: 2012-04-13.16:55:03
I have added scripts that translate all first problems in each domain 
for the two python versions. Also I set up a Rietveld issue at 
http://codereview.appspot.com/6028046/.
msg2097 (view) Author: malte Date: 2012-04-13.12:00:23
Sounds good! It's easy to break compatibility, so it would be good if we had an
automated way to test this. Something very simple (run the planner with all
three python versions on some input and check that it completes) would be a good
start.
msg2096 (view) Author: jendrik Date: 2012-04-13.11:49:27
We should support support all common python versions: 2.6, 2.7 and also 3.x

As long as we don't use new python3 language features, it seems we can support 
both major versions with a single codebase.
History
Date User Action Args
2012-05-01 12:09:07jendriksetstatus: chatting -> resolved
2012-05-01 12:08:43jendriksetsuperseder: + do not run translator with python2.6
messages: + msg2147
2012-04-29 14:24:17maltesetmessages: + msg2145
2012-04-29 14:15:05jendriksetmessages: + msg2144
2012-04-29 14:15:03jendriksetmessages: + msg2143
2012-04-29 14:09:02maltesetmessages: + msg2142
2012-04-29 13:35:54maltesetmessages: + msg2141
2012-04-29 13:29:37jendriksetmessages: + msg2140
2012-04-29 13:20:10maltesetmessages: + msg2139
2012-04-29 09:07:08erezsetmessages: + msg2138
2012-04-27 12:09:47jendriksetnosy: + erez
messages: + msg2134
2012-04-27 12:05:02jendriksetmessages: + msg2133
2012-04-27 10:55:43maltesetmessages: + msg2132
2012-04-27 09:45:03jendriksetmessages: + msg2131
2012-04-25 13:59:26maltesetmessages: + msg2130
2012-04-18 18:19:22jendriksetmessages: + msg2129
2012-04-18 15:02:11maltesetmessages: + msg2128
2012-04-18 13:50:06jendriksetfiles: + js-translator-python3-2.6-eval.tar.gz
messages: + msg2127
2012-04-18 13:49:19jendriksetfiles: - unnamed
2012-04-18 13:48:32jendriksetfiles: - js-translator-python3-eval.tar.gz
2012-04-18 00:15:03jendriksetfiles: + unnamed
messages: + msg2126
2012-04-17 21:36:18maltesetmessages: + msg2125
2012-04-17 19:40:03jendriksetmessages: + msg2124
2012-04-17 19:15:02jendriksetmessages: + msg2123
2012-04-17 16:47:50maltesetmessages: + msg2122
2012-04-17 14:45:03jendriksetfiles: + js-translator-python3-3.2-eval.tar.gz
messages: + msg2121
2012-04-17 10:50:03jendriksetmessages: + msg2119
2012-04-17 08:38:26maltesetmessages: + msg2118
2012-04-16 22:50:04jendriksetfiles: + js-translator-python3-2.7-eval.tar.gz
messages: + msg2116
2012-04-15 23:02:47maltesetmessages: + msg2109
2012-04-15 21:15:03jendriksetmessages: + msg2108
2012-04-15 20:00:29maltesetmessages: + msg2107
2012-04-15 13:40:05jendriksetfiles: + js-translator-python3-eval.tar.gz
messages: + msg2106
2012-04-13 18:09:17maltesetmessages: + msg2105
2012-04-13 17:35:02jendriksetmessages: + msg2103
2012-04-13 17:00:26maltesetmessages: + msg2102
2012-04-13 16:59:56maltesetmessages: + msg2101
2012-04-13 16:55:03jendriksetmessages: + msg2100
2012-04-13 12:00:23maltesetstatus: unread -> chatting
messages: + msg2097
2012-04-13 11:49:27jendrikcreate