Issue 740: make translator output file configurable

Title	make translator output file configurable
Priority	wish	Status	resolved
Superseder		Nosy List	guillem, jendrik, malte, patrick.luehne
Assigned To	guillem	Keywords	easy, translator
Optional summary

Created on 2017-10-25.16:10:43 by patrick.luehne, last changed by guillem.

Messages
msg8075 (view)	Author: guillem	Date: 2018-11-08.12:15:43
This is merged now.
msg8048 (view)	Author: guillem	Date: 2018-11-05.09:22:41
Oops sorry, had gone completely off my radar, let me have a try at the merge later.
msg8046 (view)	Author: malte	Date: 2018-11-04.05:03:06
Guillem, is this still on the radar? Anything we can do to help get it merged?
msg7752 (view)	Author: malte	Date: 2018-09-21.16:40:40
Looks good to me, feel free to merge! (Sorry for letting this sit so long.)
msg6834 (view)	Author: jendrik	Date: 2018-03-14.17:47:43
I made the aforementioned change (rm -f output.sas) in Lab.
msg6811 (view)	Author: jendrik	Date: 2018-02-19.14:33:32
No need for that, I think. I'll add it once this issue is merged.
msg6810 (view)	Author: guillem	Date: 2018-02-19.14:28:38
Makes sense, yes. Do you want me to open a PR for lab with that (the simple addition of the -f flag)?
msg6809 (view)	Author: jendrik	Date: 2018-02-19.13:07:10
The old comments are preserved in the "Activity" tab on Bitbucket. I left some minor comments over there. Regarding Lab, I think it's best to just use "rm -f output.sas". When a user passes "--sas-task myfile" the SAS file is probably needed and shouldn't be removed.
msg6808 (view)	Author: guillem	Date: 2018-02-19.12:29:57
Thanks Jendrik - I've updated the PR accordingly. Not sure if updating a PR in bitbucket removes the old comments? Hope not. BTW this change affects LAB as well. Namely, the constructor of FastDownwardExperiment (lab/experiment.py:205) adds a hardcoded "rm output.sas" which should be modified whenever the --output-sas option is used. Do you want me to open an issue on that? Or we can have a look at it together this week.
msg6806 (view)	Author: jendrik	Date: 2018-02-16.16:27:16
I left some comments on Bitbucket.
msg6795 (view)	Author: malte	Date: 2018-02-09.10:39:21
I can have a look next week, after I'm done with my grant proposal. If I don't write again before Friday, feel free to remind me.
msg6794 (view)	Author: guillem	Date: 2018-02-09.10:33:17
The diff for this issue is here: https://bitbucket.org/gfrances/downward-issue740/pull-requests/1/issue740/diff I've tested this with a few configurations, including invoking only the translator component, invoking different aliases, standard search invokation, plus a portfolio configuration, running those on a few instances and searching for all *.sas in the workspace. Just for the record, I temporarily disabled the removal of "output.sas" in LAB's code, so that I could check the exact filenames being used. Everything works as expected. Malte / Jendrik, let me know if you'd test any other particular thing, or if this looks good to you to be merged into the default branch. Patrick, perhaps you'd like to test this as well once we merge it into the main repo. G.
msg6793 (view)	Author: patrick.luehne	Date: 2018-02-08.10:56:28
@malte: Thanks for your thorough explanations! Please know that my points weren’t meant as criticism; I just wanted my feedback to be heard :). I totally understand that the FastDownward team have many priorities and that you all had to put lots of thought into how to make things efficient and effective research-wise. As a “user” of the translator component, I had been wondering about some of the decisions, and your explanations clarified these a lot. @guillem: And thanks also for your current work on the --sas-file option :).
msg6792 (view)	Author: malte	Date: 2018-02-07.17:34:46
Looks good to me! Make sure that it also works when using portfolio configurations. I'm not sure they use the same code path to call the search component.
msg6791 (view)	Author: guillem	Date: 2018-02-07.17:27:48
I'm having a look at this. Basically, after the few changes I'm implementing, the behavior should be as follows: * The translate.py script has an option "--sas-file FILENAME" that allows to specify the filename where the translated SAS task will be placed. This can of course be a relative or an absolute path. This allows the translate output to be customized even if only the translator is needed. * The fast-downward.py main script has an option with the same name and semantics; if specified, it passes it down to the translator AND uses it to stdin-feed it to the search component. Malte/Jendrik, is there any particular or non-standard use case that you'd like to test?
msg6789 (view)	Author: malte	Date: 2018-02-06.17:53:19
> @malte: I didn’t suggest using exit codes exclusively for error reporting. > What I intended to say was that only the exit code could be used to check > whether the program finished its job successfully. If not, the error > output is still available in addition to the exit code to figure out what > went wrong. We often have gigabytes of stdout output in an experiment -- up to several megabytes per planner run, multiplied by perhaps 10000 planner runs. But the stderr output is short and sweet, almost always nothing and rarely more than 100 bytes, short enough to be included in the overall experiment report. > In this way, noncritical output printed to stderr could simply be ignored > if the exit code was 0, in which case the (well-formed) result can be used > by the next tool in the chain. Sure, that's what a modern logging system with explicit log levels like debug, warning, error, critical would enable. That would be nice to have, and that's part of what I meant about modernizing the logging system in the first paragraph of msg6785. But at the current moment, it would make our lives much more difficult if the critical stuff we need to see is buried in gigabytes of error output. Even if we had a log facility where errors and other output can be distinguished at a glance, "grep '^error:' */.log" can take ages to sift through gigabytes of data, where "ls */.err" will be very fast. Roughly speaking, we have three types of output: 1) Actual results, like the plans that are produced. 2) Information about the progress of the planning process, which are mainly used for automatic consumption by scripts that run planner experiments. 3) Information about fatal errors that occurred during the planner process. For you as a user, separation of 1+2 is important, which is why you like to be able to see only 1 on stdout. (Currently, 1 can only be separated by going through a file.) This makes a lot of sense. But for us who are mainly interested in the planning process itself, separation of 2+3 is much more important because we don't really care about the plans; our main "outcomes" are the things that the planner writes to stdout. So for us it would be quite undesirable to combine 2+3 in the same stream. At the moment, the main people who run the planner are other planning researchers who are interested in the planning process itself, not "real users", so this is the audience we cater for primarily. But of course it would be better to cater to both audiences well.
msg6788 (view)	Author: patrick.luehne	Date: 2018-02-06.17:27:10
@malte: I didn’t suggest using exit codes exclusively for error reporting. What I intended to say was that only the exit code could be used to check whether the program finished its job successfully. If not, the error output is still available in addition to the exit code to figure out what went wrong. In this way, noncritical output printed to stderr could simply be ignored if the exit code was 0, in which case the (well-formed) result can be used by the next tool in the chain. However, I believe that you have reasoned about error reporting a lot already. I was just curious to know how you handle this, because I believe that existing output on stderr isn’t considered an error by itself with most existing tools :).
msg6787 (view)	Author: malte	Date: 2018-02-06.17:19:07
> Just out of curiosity, why is output to stderr interpreted as actual errors? > Couldn’t this be handled by looking at whether the exit code was 0 (success)? > I believe that’s a more common convention :). Couldn’t this be handled by looking at whether the exit code was 0 (success)? > I believe that’s a more common convention :). Setting the exit code to non-zero on error is not mutually exclusive with reporting errors on stderr. :-) All basic unix tools like ls, cp, cat etc. write something to stderr iff they return a non-zero exit code. Of course we use exit codes, too. But it's hard to communicate exactly what went wrong in a single number when errors can occur in the driver, in the translator, in the search component, in different parts of a portfolio etc. In automated experiments, we want to be able to tell the experimenter what went wrong if there is something that needs their attention.
msg6786 (view)	Author: patrick.luehne	Date: 2018-02-06.16:48:48
@malte: Thanks for the explanations :). Making more use of libraries indeed sounds like a good long-term plan! Just out of curiosity, why is output to stderr interpreted as actual errors? Couldn’t this be handled by looking at whether the exit code was 0 (success)? I believe that’s a more common convention :).
msg6785 (view)	Author: malte	Date: 2018-02-06.16:45:09
Hi Patrick, I think several common conventions for stdout vs. stderr exist, and unfortunately we already follow a different one (not just in the translator but in the planner as a whole): in our planner runs, every output to stderr is considered an actual error or failure that requires investigation. One thing we might be able to do in the future to support your request is make logging more configurable in general, but for now logging improvements are more likely to happen in the search component than in the translator component. We have "librarification" of the planner as a longer-term goal that we started actively working on recently, but so far it's not really clear when or even if it will happen.
msg6784 (view)	Author: patrick.luehne	Date: 2018-02-06.16:36:10
@guillem, @malte: Thanks for tackling this issue! I’m sorry that I didn’t find the time to come back to this issue, but I really appreciate that this is being worked on :). I agree that making the output filename configurable solves part of the problem, and it would be a desirable option generally. I still think that using “-” as an output filename instead of /dev/stdout would be convenient. Considering @malte’s argument A, I think that all output other than the actually translated file should go to stderr. This is how many tools dinstinguish the actual output (on stdout) from additional runtime information, warnings, errors, etc. (on stderr). Also, I don’t believe that it will be problematic to have this kind of output directed to stderr, as I don’t expect it to scale with the actual output. What concerns using pipes, I’d say that the fact that the translated representation exists multiple times in memory is to be somewhat expected in cases where tools reside in separate binaries. Generally, I think that it’d be best to directly access the original representation from tools using it by providing a library. In this way, there would be no need for pipes, no necessary duplicates of the translated representation, or alternative “hacks” required. However, I believe that the current architecture doesn’t allow for this, and that this would require lots of work. Still, I think that the output filename option would be a cool thing to have for the time being :).
msg6783 (view)	Author: guillem	Date: 2018-02-06.16:21:28
I'll be tackling this issue in the next days, just to avoid any redundant work :-)
msg6765 (view)	Author: malte	Date: 2018-01-03.20:40:51
I have spun off the part regarding changing the behaviour of the driver script to not produce a temporary file into issue758, copying the nosy list from this issue. This issue is now only about making the translator output file name configurable. I have renamed it accordingly. I think neither of the two issues are an enormous amount of work, but I won't be able to look into them myself in the near future.
msg6573 (view)	Author: malte	Date: 2017-10-26.10:42:56
> The possible solutions I currently have in mind involve being able to > configure the output file name in the translator and using either named pipes > or the /dev/fd... pseudo-files inside the driver script to avoid using actual > temporary files. Addendum: after some more reading, the /dev/fd... solution is probably better because file descriptors cannot be externally interfered with and don't need to be cleaned up manually. I also expect it to be more portable in the Unix world, but that needs further investigation.
msg6572 (view)	Author: malte	Date: 2017-10-26.10:29:50
I moved the original issue report from "optional summary" to "change note". "Change note" is meant for messages, including the original message to open an issue. "Optional summary" is an editable field that is intended for TODO lists and similar things while we're working on the issue. Sorry for the unintuitive interface! One of these days we should improve it, but so much to do, so little time. :-)
msg6571 (view)	Author: malte	Date: 2017-10-26.10:28:15
[original comment by Patrick Lühne:] Dear FastDownward maintainers, At our research group at the University of Potsdam, we are using FastDownward’s translator (with --translate) successfully as a very useful preprocessing step in our planning projects :). At the time being, FastDownward’s translator produces an output.sas file, which we later parse in our applications for further processing. However, printing the translator output to files has multiple drawbacks. 1. We can’t simply pipe FastDownward’s output within our toolchain like this: fast-downward --translate [<files>] \| planning-tool 2. When we run two FastDownward processes simultaneously in the same working directory, the same file is read and written to by the two processes without synchronization, resulting in all kinds of issues. 3. The working directory is polluted with files that need to be cleaned up later. 4. FastDownward’s translator fails when run from a directory without write permissions for the current user. For this reason, I would like to know whether it would make sense to have FastDownward print all output to stdout by default instead to files? I know that there are cases where multiple output files are produced. Perhaps, it would, thus, be best to have options to redirect output as intended by the user. For instance, --output=output.sas would print the output to a file, while --output=- would print to stdout (using a dash is a convention used by many programs [1]). And --output-<alternative> would configure the output of some other stream, for instance, the output after postprocessing. As explained above, our group could really make use of such a feature. Do you have plans to implement something along these lines in the future? If you currently don’t have the resources to implement this, I could certainly lend a hand :). I haven’t seen any issue like this on the issue tracker before, so I apologize if a similar idea has been proposed before. Many thanks, Patrick [1] https://unix.stackexchange.com/a/16364
msg6570 (view)	Author: malte	Date: 2017-10-26.10:09:27
Dear Patrick, thanks for the very thoughtfully formulated feature request! We have thought about this general problem in the past, for all the reasons you mention. We want to implement something along those lines, but we are not yet clear on the details. Some considerations: A. If we use stdout for output, the translated problem file will be mixed with the regular output of the translator, so whichever script (or user) is calling the translator will have to take them apart again. In some use cases, a --silent setting to suppress log output could address this, but in others we would need to do something more complex, such as making log output configurable, too. B. One drawback of using pipes and related approaches that are based on the output and input process running concurrently is that at some point near the end of execution of the translator, the whole translated task will still be in memory for the translator (which is just about to finish writing it), and it will also be in memory for the search component or other tool following the translator (which is just about to finish reading it). Instead of using pipes directly, one can also capture and then feed forward the output, but again we would have a duplicate copy of the data. In some planning tasks, for example from the Planning and Learning competitions, this can be a substantial issue. To be clear, these are not meant as arguments against the proposed feature, just as considerations in selecting an implementation strategy. The possible solutions I currently have in mind involve being able to configure the output file name in the translator and using either named pipes or the /dev/fd... pseudo-files inside the driver script to avoid using actual temporary files. (In a bash script, I think the cleanest solution would be to use process substitution, and from the bash documentation it looks like process substitution is internally implemented with named pipes and/or /dev/fd..., depending on what the operating system supports.) This addresses point A., but not B. I don't know a good way to address B. I'm not sure if we want to implement the "-" convention because at the moment I don't want to encourage using stdout for the output task due to point A above. But note that even if "-" is not supported, you can use /dev/stdout with any program instead. (We usually use Linux. After reading a bit, I'm not sure if this is supported on a Mac, but there may be more portable alternatives. If so, I'd be very interested in hearing more. For Windows, unfortunately I don't know if there is an easy way of doing this kind of thing.) So in terms of a practical TODO list, there would be two steps to implement: 1. Make the translator output file name configurable. 2. Change the plumbing inside the driver script so that a full planner run by default does not write intermediate files to disk, does not need write access to the current directory, and does not rely on exclusive use of a given file name. Step 1. is probably quite a bit easier than step 2., and it may already be sufficient to support your use case. So we could try to do that one soon, and work on 2. if/when someone has a bit of time to spare. Would that work for you?

History
Date	User	Action	Args
2018-11-08 12:15:43	guillem	set	status: reviewing -> resolved messages: + msg8075
2018-11-05 09:22:41	guillem	set	messages: + msg8048
2018-11-04 05:03:06	malte	set	messages: + msg8046
2018-09-21 16:40:40	malte	set	messages: + msg7752
2018-03-14 17:47:43	jendrik	set	messages: + msg6834
2018-02-19 14:33:32	jendrik	set	messages: + msg6811
2018-02-19 14:28:38	guillem	set	messages: + msg6810
2018-02-19 13:07:10	jendrik	set	messages: + msg6809
2018-02-19 12:29:57	guillem	set	messages: + msg6808
2018-02-16 16:27:16	jendrik	set	messages: + msg6806
2018-02-09 10:39:21	malte	set	messages: + msg6795
2018-02-09 10:33:17	guillem	set	status: in-progress -> reviewing messages: + msg6794
2018-02-08 10:56:28	patrick.luehne	set	messages: + msg6793
2018-02-07 17:34:46	malte	set	messages: + msg6792
2018-02-07 17:27:48	guillem	set	status: chatting -> in-progress messages: + msg6791
2018-02-06 17:53:19	malte	set	messages: + msg6789
2018-02-06 17:27:10	patrick.luehne	set	messages: + msg6788
2018-02-06 17:19:07	malte	set	messages: + msg6787
2018-02-06 16:48:48	patrick.luehne	set	messages: + msg6786
2018-02-06 16:45:09	malte	set	messages: + msg6785
2018-02-06 16:36:10	patrick.luehne	set	messages: + msg6784
2018-02-06 16:21:28	guillem	set	assignedto: guillem messages: + msg6783 nosy: + guillem
2018-01-03 20:40:52	malte	set	messages: + msg6765 keyword: + easy, translator title: Print translator output to stdout and not to file -> make translator output file configurable
2017-10-26 11:49:52	jendrik	set	nosy: + jendrik
2017-10-26 10:42:56	malte	set	messages: + msg6573
2017-10-26 10:29:50	malte	set	messages: + msg6572
2017-10-26 10:28:15	malte	set	messages: + msg6571 summary: Dear FastDownward maintainers, At our research group at the University of Potsdam, we are using FastDownward’s translator (with --translate) successfully as a very useful preprocessing step in our planning projects :). At the time being, FastDownward’s translator produces an output.sas file, which we later parse in our applications for further processing. However, printing the translator output to files has multiple drawbacks. 1. We can’t simply pipe FastDownward’s output within our toolchain like this: fast-downward --translate [<files>] \| planning-tool 2. When we run two FastDownward processes simultaneously in the same working directory, the same file is read and written to by the two processes without synchronization, resulting in all kinds of issues. 3. The working directory is polluted with files that need to be cleaned up later. 4. FastDownward’s translator fails when run from a directory without write permissions for the current user. For this reason, I would like to know whether it would make sense to have FastDownward print all output to stdout by default instead to files? I know that there are cases where multiple output files are produced. Perhaps, it would, thus, be best to have options to redirect output as intended by the user. For instance, --output=output.sas would print the output to a file, while --output=- would print to stdout (using a dash is a convention used by many programs [1]). And --output-<alternative> would configure the output of some other stream, for instance, the output after postprocessing. As explained above, our group could really make use of such a feature. Do you have plans to implement something along these lines in the future? If you currently don’t have the resources to implement this, I could certainly lend a hand :). I haven’t seen any issue like this on the issue tracker before, so I apologize if a similar idea has been proposed before. Many thanks, Patrick [1] https://unix.stackexchange.com/a/16364 ->
2017-10-26 10:09:27	malte	set	status: unread -> chatting nosy: + malte, patrick.luehne messages: + msg6570
2017-10-25 16:10:43	patrick.luehne	create

Issue740