Issue 329: add python3 support for translator

Title	add python3 support for translator
Priority	feature	Status	resolved
Superseder	do not run translator with python2.6 View: 332	Nosy List	erez, jendrik, malte
Assigned To	jendrik	Keywords
Optional summary

Created on 2012-04-13.11:49:27 by jendrik, last changed by jendrik.

Files
File name	Uploaded	Type	Edit	Remove
js-translator-python3-2.6-eval.tar.gz	jendrik, 2012-04-18.13:50:06	application/x-gzip
js-translator-python3-2.7-eval.tar.gz	jendrik, 2012-04-16.22:50:04	application/x-gzip
js-translator-python3-3.2-eval.tar.gz	jendrik, 2012-04-17.14:45:03	application/x-gzip

Messages
msg2147 (view)	Author: jendrik	Date: 2012-05-01.12:08:42
I opened a new issue for the python2.6 warning: issue332.
msg2145 (view)	Author: malte	Date: 2012-04-29.14:24:17
> Are you aware that we would have to bundle python2.7 or pass --force for > all our experiments on the gkigrid then? Yes -- but it's only the experiments where we actually run the translator, which at least in my case is much less than 5%. I think it'd be a good reminder for us that these runtimes should not be used for anything. Do we have a way in the scripts to pass options to the translator? > Shouldn't we wait until python2.7 is available there? I don't think so. The typical upgrade cycle for compute infrastructure like this is four years or more. The default Python at the compute cluster in Basel is Python 2.4 (although, to be fair, python2.6 is available too; it's just not the default that you get with "Python"). The main thing we should care about here is that people who use our code to compare their planners against report runtimes that we can live with. As long as the planner works out of the box with Python 2.6, I would expect that in 80% or more of the cases this will not be the case.
msg2144 (view)	Author: jendrik	Date: 2012-04-29.14:15:05
Are you aware that we would have to bundle python2.7 or pass --force for all our experiments on the gkigrid then? Shouldn't we wait until python2.7 is available there?
msg2143 (view)	Author: jendrik	Date: 2012-04-29.14:15:03
1) Yes I can.
msg2142 (view)	Author: malte	Date: 2012-04-29.14:09:01
Buildbot tests seem to work, thanks guys! :-) Two more things: 1) I think we really need a big fat warning when Python 2.6 is used. Actually, I think the planner should even refuse to run unless it is passed a special option ("--force" or whatever). ("Errors should never pass silently, unless explicitly silenced", and major performance degradation definitely is an error in our field.) But this can be a separate issue. Jendrik, can you take care of this? 2) Once 1) is done, an email should be sent to the Google group.
msg2141 (view)	Author: malte	Date: 2012-04-29.13:35:54
Ah, OK. It'd be good to work on the branches in the future to make sure we got all the changes in a place where we can find them again. "Closing" a branch is a bit of a misnomer -- they can be opened and closed an arbitrary number of times. A branch is just a piece of metadata associated with a revision that is inherited from the parent (or first parent, in case of a merge), and "closing" a branch just means adding a piece of global metadata that causes the branch to not be reported by "hg branch" by default. So it's no problem to use "hg branch" again with a branch that is already closed, or to update to a closed branch and keep working on it. (It will be considered open again as soon as a commit with that branch name is made.)
msg2140 (view)	Author: jendrik	Date: 2012-04-29.13:29:36
The issue329 branch had already been closed, so I directly pushed the changes in misc to master.
msg2139 (view)	Author: malte	Date: 2012-04-29.13:20:10
Thanks, Erez! Jendrik, I don't see your latest changes in your repository. The last changes there were on April 16. Can you push what you mention in msg2133 to the issue329 branch?
msg2138 (view)	Author: erez	Date: 2012-04-29.09:07:07
I've incorporated the quick test in the full build (5 minutes after every commit), and the "./test-translator first" in the nightly and weekly builds. The change is in branch issue329test in my downward-fixes repository (ssh://hg@bitbucket.org/batman/downward-fixes). I'm afraid I can't really test this, since I don't have a build master. Once you update the master.cfg file, don't forget to tell the build master to reload it.
msg2134 (view)	Author: jendrik	Date: 2012-04-27.12:09:47
Adding erez to the discussion. I would recommend running ./test-translator without arguments on every commit (84% code coverage) and "./test-translator first" nightly (85% code coverage).
msg2133 (view)	Author: jendrik	Date: 2012-04-27.12:05:02
Done. Citing the documentation: # Run small test on a few problems. # $ ./test-translator # # Run bigger test on the first problem of each domain. # $ ./test-translator first # # Run test on specific problems. # $ ./test-translator gripper:prob01.pddl depot:pfile1 # # Returns 0 on success and 1 if an error occurs.
msg2132 (view)	Author: malte	Date: 2012-04-27.10:55:43
Your choice, but it'd be good if it had a decent code coverage.
msg2131 (view)	Author: jendrik	Date: 2012-04-27.09:45:02
OK, then let's have a big and a small test. The big one runs all first problems now. Which problems shell the small one run?
msg2130 (view)	Author: malte	Date: 2012-04-25.13:59:26
OK, discussion on the mailing list seems to have petered out. The remaining TODO is to set up the tests for the buildbot. (As discussed on the list, this might require defining a simpler test to run on every commit in addition to the more complex ones to run daily.)
msg2129 (view)	Author: jendrik	Date: 2012-04-18.18:19:21
I pushed the code and sent the mail.
msg2128 (view)	Author: malte	Date: 2012-04-18.15:02:10
Hmmm, I wrote a long comment and somehow lost it. Shorter version: - Python 2.6 is indeed much slower than 2.7 or 3.2 here. I verified this locally on PSR-Large #50: 730 seconds with Python 2.6; 108 seconds with Python 2.7. This is not due to wrong time measurement. The performance difference is actually there. - Considering this, I would say that we should require Python >= 2.7 in the future. We should adjust the hash-bang lines, the new-scripts, the new-new-scripts, and add a warning to the code if it's manually called with an older Python. - Since upgrading to a new Python will make the translator much faster, I don't mind the 10% of performance loss we get through Python 3 compatibility, so the code for this issue can be merged. - The buildbot should automatically test Python 2.6, 2.7 and 3.x compatibility as a new test. This may require Erez to install 2.7 and 3.x. - All of the above, and also the background of this (that you ported the planner to 3.x and what the performance number are) should be discussed on downward-dev. Can you send a summary of everything to the list? - Once downward-dev is happy, we should send an announcement to the Google group.
msg2127 (view)	Author: jendrik	Date: 2012-04-18.13:50:06
Here are the results for python2.6 with a manually compiled python.
msg2126 (view)	Author: jendrik	Date: 2012-04-18.00:15:03
This may indeed be the cause of the strange times. Habakuk (and probably the cluster) has python 2.6.0 and the bug in os.times() (Issue #1040026: Fix os.times result on systems where HZ is incorrect.) was fixed in 2.6.2. I am running the experiment with 2.6.8 now, so we should be fine. The other versions are not affected either.
msg2125 (view)	Author: malte	Date: 2012-04-17.21:36:18
Another point to consider if runtime results are strange: Python had a bug in os.times() for a long time. It may be worth also measuring translator runtime externally with the bash "time" and/or "times" commands. ("time" is usually easier to work with.)
msg2124 (view)	Author: jendrik	Date: 2012-04-17.19:40:03
The 2.7 and 3.2 experiments have been done with manually compiled python versions. I'll rerun the 2.6 experiment with a manually compiled python as well (measuring also the memory).
msg2123 (view)	Author: jendrik	Date: 2012-04-17.19:15:02
Only for Python 2.7, not fpr Python 2.6. Shall I run the experiment again?
msg2122 (view)	Author: malte	Date: 2012-04-17.16:47:50
Do we have memory usage information for Python 2.x, too? (Can't check at the moment; I'm on the road and the attachments are too large for my connection.) If not, that would be another useful thing to add.
msg2121 (view)	Author: jendrik	Date: 2012-04-17.14:45:03
Here are the times and memory usage for python3.2. How do we proceed from here?
msg2119 (view)	Author: jendrik	Date: 2012-04-17.10:50:03
Yes, the two experiments were both run on opteron_core.q. I'll run another experiment for python3.x.
msg2118 (view)	Author: malte	Date: 2012-04-17.08:38:26
Are these on the same machines? I'd be interested in the 2.6 vs. 2.7 vs 3.x runtimes, but here the 2.7 runtimes appear to be more than three times smaller than the 2.6 ones (looking at "translator_time_done"), which looks a bit suspicious.
msg2116 (view)	Author: jendrik	Date: 2012-04-16.22:50:04
These results compare translator times and memory peaks for python2.7 (not python3.2 as the unfortunate naming js-translator3-2 might suggest, the 2 was supposed to stand for 2nd experiment). The numbers look very similar to the 2.6 results.
msg2109 (view)	Author: malte	Date: 2012-04-15.23:02:47
> > What are the numbers for memory? > > Those are not measured currently. How do you suppose we should measure them? One easy way is to follow the approach used in the search code; see get_peak_memory_in_kb in src/search/utilities.cc. This needs a test that we are on a Linux platform; if not, it is OK to report that peak memory numbers are unavailable.
msg2108 (view)	Author: jendrik	Date: 2012-04-15.21:15:03
> What are the numbers for memory? Those are not measured currently. How do you suppose we should measure them? > Can we get numbers for Python 2.7 and Python 3.2? We would have to compile those manually first for the cluster and write a more complicated experiment. Maybe we should have a look at the code again first. From what I see the parts that perform significantly worse are computing_fact_groups, detecting_unreachable_propositions and instantiating_groups and writing_output. I have tried looking at the code for the first three, but haven't been able to find possible changes that are responsible for the performance decline. Do you have any suspects that we could fix while remaining python3 compatible?
msg2107 (view)	Author: malte	Date: 2012-04-15.20:00:29
Hmmm, roughly 10% loss in performance is not trivial, unfortunately. What are the numbers for memory? Can we get numbers for Python 2.7 and Python 3.2?
msg2106 (view)	Author: jendrik	Date: 2012-04-15.13:40:04
Here are the scatter plots comparing the two code versions for python2.6.
msg2105 (view)	Author: malte	Date: 2012-04-13.18:09:17
Code review done.
msg2103 (view)	Author: jendrik	Date: 2012-04-13.17:35:02
I just fixed that, it now runs all available versions from the set {2.6, 2.7, 3}. > > Some of the changes look like they could affect performance on Python 2. Can we > do a before/after scatter-plot on translator time? I would like to wait until after the code review for that. If there are still more changes, we might have to do the same experiment twice...
msg2102 (view)	Author: malte	Date: 2012-04-13.17:00:26
(PS: Sounds like a lot of work, but once we have set something like this up, it would also be useful for other changes to the translator we might do in the future.)
msg2101 (view)	Author: malte	Date: 2012-04-13.16:59:56
Thanks! It may take me some time to look at this properly, unfortunately. It looks like the test script doesn't test Python 2.6? Some of the changes look like they could affect performance on Python 2. Can we do a before/after scatter-plot on translator time? It would be great to also do the same thing for memory usage, and maybe also a Python 2.6/2.7/3.x comparison on the same metrics.
msg2100 (view)	Author: jendrik	Date: 2012-04-13.16:55:03
I have added scripts that translate all first problems in each domain for the two python versions. Also I set up a Rietveld issue at http://codereview.appspot.com/6028046/.
msg2097 (view)	Author: malte	Date: 2012-04-13.12:00:23
Sounds good! It's easy to break compatibility, so it would be good if we had an automated way to test this. Something very simple (run the planner with all three python versions on some input and check that it completes) would be a good start.
msg2096 (view)	Author: jendrik	Date: 2012-04-13.11:49:27
We should support support all common python versions: 2.6, 2.7 and also 3.x As long as we don't use new python3 language features, it seems we can support both major versions with a single codebase.

History
Date	User	Action	Args
2012-05-01 12:09:07	jendrik	set	status: chatting -> resolved
2012-05-01 12:08:43	jendrik	set	superseder: + do not run translator with python2.6 messages: + msg2147
2012-04-29 14:24:17	malte	set	messages: + msg2145
2012-04-29 14:15:05	jendrik	set	messages: + msg2144
2012-04-29 14:15:03	jendrik	set	messages: + msg2143
2012-04-29 14:09:02	malte	set	messages: + msg2142
2012-04-29 13:35:54	malte	set	messages: + msg2141
2012-04-29 13:29:37	jendrik	set	messages: + msg2140
2012-04-29 13:20:10	malte	set	messages: + msg2139
2012-04-29 09:07:08	erez	set	messages: + msg2138
2012-04-27 12:09:47	jendrik	set	nosy: + erez messages: + msg2134
2012-04-27 12:05:02	jendrik	set	messages: + msg2133
2012-04-27 10:55:43	malte	set	messages: + msg2132
2012-04-27 09:45:03	jendrik	set	messages: + msg2131
2012-04-25 13:59:26	malte	set	messages: + msg2130
2012-04-18 18:19:22	jendrik	set	messages: + msg2129
2012-04-18 15:02:11	malte	set	messages: + msg2128
2012-04-18 13:50:06	jendrik	set	files: + js-translator-python3-2.6-eval.tar.gz messages: + msg2127
2012-04-18 13:49:19	jendrik	set	files: - unnamed
2012-04-18 13:48:32	jendrik	set	files: - js-translator-python3-eval.tar.gz
2012-04-18 00:15:03	jendrik	set	files: + unnamed messages: + msg2126
2012-04-17 21:36:18	malte	set	messages: + msg2125
2012-04-17 19:40:03	jendrik	set	messages: + msg2124
2012-04-17 19:15:02	jendrik	set	messages: + msg2123
2012-04-17 16:47:50	malte	set	messages: + msg2122
2012-04-17 14:45:03	jendrik	set	files: + js-translator-python3-3.2-eval.tar.gz messages: + msg2121
2012-04-17 10:50:03	jendrik	set	messages: + msg2119
2012-04-17 08:38:26	malte	set	messages: + msg2118
2012-04-16 22:50:04	jendrik	set	files: + js-translator-python3-2.7-eval.tar.gz messages: + msg2116
2012-04-15 23:02:47	malte	set	messages: + msg2109
2012-04-15 21:15:03	jendrik	set	messages: + msg2108
2012-04-15 20:00:29	malte	set	messages: + msg2107
2012-04-15 13:40:05	jendrik	set	files: + js-translator-python3-eval.tar.gz messages: + msg2106
2012-04-13 18:09:17	malte	set	messages: + msg2105
2012-04-13 17:35:02	jendrik	set	messages: + msg2103
2012-04-13 17:00:26	malte	set	messages: + msg2102
2012-04-13 16:59:56	malte	set	messages: + msg2101
2012-04-13 16:55:03	jendrik	set	messages: + msg2100
2012-04-13 12:00:23	malte	set	status: unread -> chatting messages: + msg2097
2012-04-13 11:49:27	jendrik	create

Issue329