Issue 762: complain about incorrect SAS file - Fast Downward issue tracker

Title	complain about incorrect SAS file
Priority	wish	Status	in-progress
Superseder		Nosy List	jendrik, malte, patfer
Assigned To	patfer	Keywords
Optional summary

Created on 2018-03-06.21:29:52 by jendrik, last changed by jendrik.

Files
File name	Uploaded	Type	Edit	Remove
rubiks_cube.sas	jendrik, 2018-03-06.21:29:51	application/octet-stream

Messages
msg10189 (view)	Author: jendrik	Date: 2021-03-18.14:23:35
Patrick, do you still plan to pursue this?
msg6987 (view)	Author: jendrik	Date: 2018-04-03.21:45:08
issue700 is now merged.
msg6957 (view)	Author: patfer	Date: 2018-03-27.08:25:49
Your comments are processed and then we wait for merging issue700.
msg6940 (view)	Author: jendrik	Date: 2018-03-17.10:02:07
I left some more comments on Bitbucket. I think it's best to wait with the next review until after issue700 has been merged into default and from there into the branch for this issue. Before merging the default into this branch, please address the comments on Bitbucket, though. Otherwise, they'll easily slip under the radar.
msg6937 (view)	Author: patfer	Date: 2018-03-16.21:16:11
I incorporated the comments. There are now two types of error output. 1. Parsing errors which show what is the error (Unable to parse next int) + Context (Context: mutex group X) 2. Value errors. Short sentence telling the value error similar to what check_magic is doing(Invalid number of operators in NAME.) The check for ordered conditions can now be enabled/disabled with a global flag which is hardcoded. I think there is no need for a commandline arguments. If at one point some techniques depent on the ordering, the user should not be able to disable it. The pull request for new comments: https://bitbucket.org/PatFer/issue762/pull-requests/1/pr-fix-issue762-rev2/diff
msg6915 (view)	Author: jendrik	Date: 2018-03-16.16:53:39
I left some comments on Bitbucket, but I think it's best to wait now until you've resolved the accumulated comments from me and Malte :-)
msg6898 (view)	Author: malte	Date: 2018-03-16.14:58:06
Interesting! The translator does perform the sorting upon creating the SASTask object, and it does contain validation code that verifies the invariants after constructing the object. This invariant check is disabled by default, but can be enabled by setting DEBUG = True in sas_tasks.py. I think the issue is that the SAS task is modified after it is first created, which breaks the order. It was intended to be conceptually immutable, but we don't enforce this, and I assume this broke when we integrated the former preprocessor into the translator, which reorders the variables after the task is created. The main purpose of sorting is not really finding things quickly, but rather iterating through the sequence in sorted order, which allows processing things in a consistent order and makes it easier to verify other properties like "cannot have two copies of the same proposition" or "cannot have two goals for the same variable". These are trivial to verify when sorted (using approaches like "is_sorted_unique" in utils/collections.h), but without sorting, verifying the properties is essentially as expensive as sorting, then verifying. Also, sorted order for things like conditions and effects helps provide a memory access pattern that works nicely with prefetching in CPU cache architectures. So on the balance, and considering we already go to the pain of sorting things in the translator, I think the canonical order of things is desirable. However, repairing the order is likely to affect performance of order-sensitive heuristics and other things like stubborn set algorithms, so will require substantial experiments. One possibility is to leave off the validation of sorting for now (e.g. disabling it with a global constant that we set to false for now), but opening an issue that looks into creating the desired order again. This should perhaps also add a SASTask.validate call in the translator (controlled by some debug flag) at the point where the task is written.
msg6894 (view)	Author: patfer	Date: 2018-03-16.14:39:33
I just detected that the test pipeline does not run successfully on the modified task. The problem is that the translator defines the sas file restrictions, but does neither follow nor check them itself. The current problem is the restriction that any conditions (prevail, or effect conditions or goal) and mutexes in a group are sorted by (var, value) and translator does not sort them. What are your thoughts on this? How much performance would it cost to sort the conditions in the translater? I think having them sorted is nice, because if you are looking for something, then you find it faster, BUT the sorting condition has no semantic influence (do we have tie breakings which could rely on it?) and I would drop this restriction.
msg6891 (view)	Author: patfer	Date: 2018-03-16.13:59:24
Unifying the phrasing makes sense. Just failing in read_next<>() is an improvement, as we are then checking the streams errors, but, the error messages would then look like: "Failed to read next int". Finding where the int was expected is difficult. I think let check_fact return a boolean would make more sense for me too. As developer I would expect it from the name to return an boolean value. The cost of this is to add a lot of code. At every check_fact() we need to add the error output and the execution stop. I would keep the level of detail: + If it fails now, the user knows relative easily why and where it failed. + It is less likely to fail silently and solve some nonsense. o Complexity: Except for some if statements (and sometimes an integer counter) the runtime should have no performance penalties. - The code becomes more bloated Indeed adding checks adds code. The methods used for checks are already stored in a separate header file.
msg6886 (view)	Author: jendrik	Date: 2018-03-16.12:47:41
I had a quick look at the pull request. If we want to keep this level of detail for the error messages, I think we should simplify and unify them a bit more. But first we need to decide whether we want this level of detail. I think that it should be enough to pass the input stream to read_next<>() and let it fail loudly if it cannot read the required type. I don't think we necessarily need to provide context of where the failure happened. For the range checks, I think we should let the checks return Booleans indicating whether the value is inside the expected range and use an if- statement to print an error if it isn't.
msg6873 (view)	Author: jendrik	Date: 2018-03-15.19:35:52
I haven't made any comments, yet. Will do so later tonight or tomorrow.
msg6871 (view)	Author: patfer	Date: 2018-03-15.18:51:23
A pull request for the fix can be found at: https://bitbucket.org/PatFer/issue762/pull-requests/1/issue762-fixed/diff Jendrik have you already made comments on the old pull request? Because by adding a new one for the new code, I cannot see the old pull request anymore.
msg6846 (view)	Author: malte	Date: 2018-03-15.11:01:41
Indeed, there is nothing wrong with tasks without operators, and we currently produce them in the translator if it detects unsolvability, for example for some tasks in the Mystery domain. The restrictions to at least one state variable, at least goal and domain sizes at least 2 are in some sense unnecessary, but as far as I recall we have decided some time ago to enforce them (for now) to make our life easier, especially as the search component input file format is not intended to be user-facing. It would make sense to lift the restriction at some point, but this does not have high priority. We already discussed what should or should not be a valid file in the context of the translator, and the current state of affairs is described in src/translate/sas_tasks.py. (It is spread over multiple doc comments.) If we want to validate the search component input more carefully, then we should ideally use exactly the same rules of validity that we define in the translator. (Of course, the rules defined there can be changed if we see that they are inadequate.) Also, because of DRY, what is or is not valid should be documented in exactly one place, and the other places that uses the same notion of validity should contain a pointer to this one authoritative place, so that in later edits we can see what the rules are.
msg6845 (view)	Author: jendrik	Date: 2018-03-15.08:48:30
I think tasks without operators should be allowed. Can you make a pull request and link to it, please?
msg6841 (view)	Author: patfer	Date: 2018-03-15.01:40:26
I reworked the parsing of the search component to perform checks if the strings could be correctly parsed and if the parsed values are valid. This solves this issue762. In the same run, I added the checks desired in issue458 and added more checks (see list below). This should resolve issue458. Furthermore, issue458 was related to issue54 and decided to document that derived variables can only be binary. The parsing checks that axioms are binary. We still have to add this to the website. First part: Parsing correctly strings to value With some safe exceptions all calls to cin >> variable are replaced by a function. The function performs cin >> variable AND checks the error bits of cin. If error bits are set, it outputs error (this explains which error istream had not where the error during parsing occured). The function can receive additionally to the mandatory arguments a list of arbitrary arguments and if an error occurred, the arguments are printed on cerr. this can be used to provide meaningful error messages (except for the if checks the complex part is handled at compile time and should not impact the runs). Checks after getline() error bits, too. Secondly, the parsed values are checked if they are valid in the context. That includes: -checking magic words (previously done) -num variables > 0 (issue458) -num goal > 0 (issue458) -values per variable > 1 (issue458) -num mutexes/axioms >= 0 -mutex group sizes > 1 -num operators > 0 -num previal conditions >= 0 -num effects >= 0 -value variables axiom layer property > -2 -for initial state & goal conditions & mutexes & prevail conditions & effects: the variable-value pairs are checked to be in bounds (if applicable -1 is also allowed) - Checks that input is empty after parsing It should now detect all invalid problem files and provide some error message (instead of SegFaults), but there are two restrictions on the output: A) The operator/axiom parsing is done in another class (constructor of Operator, Condition and Effect) and thus, on error we do not know anymore in which operator/condition/effect we are (for Variables we can use the iteration variable to identiy malicious item in the sas file). B) For efficiency reasons we have not switched to a line by line parsing. Thus, on blocks defining variables, we do not know how the parsed integers are arranged (e.g. if one variable-value pairs per line is expected, we do not recognize if one value is missing, but just take the value from the next line. This is detected latest at the next magic word.)
msg6818 (view)	Author: jendrik	Date: 2018-03-06.21:31:34
This is related to issue458. While issue458 deals with degenerate inputs, this issues deals with incorrectly formatted input.
msg6817 (view)	Author: jendrik	Date: 2018-03-06.21:29:51
We found that the attached SAS file (simplified) is accepted by Fast Downward even though it is missing the number of operators. This leads to no operators being parsed. The planner should loudly complain about the format being wrong.

History
Date	User	Action	Args
2021-03-18 14:23:35	jendrik	set	messages: + msg10189
2018-04-03 21:45:08	jendrik	set	status: reviewing -> in-progress messages: + msg6987
2018-03-27 08:25:49	patfer	set	messages: + msg6957
2018-03-17 10:02:07	jendrik	set	messages: + msg6940
2018-03-16 21:16:11	patfer	set	messages: + msg6937
2018-03-16 16:53:39	jendrik	set	messages: + msg6915
2018-03-16 14:58:06	malte	set	messages: + msg6898
2018-03-16 14:39:33	patfer	set	messages: + msg6894
2018-03-16 13:59:24	patfer	set	messages: + msg6891
2018-03-16 12:47:41	jendrik	set	messages: + msg6886
2018-03-15 19:35:52	jendrik	set	messages: + msg6873
2018-03-15 18:51:23	patfer	set	messages: + msg6871
2018-03-15 11:01:41	malte	set	messages: + msg6846
2018-03-15 08:48:30	jendrik	set	status: chatting -> reviewing messages: + msg6845
2018-03-15 01:40:26	patfer	set	assignedto: patfer messages: + msg6841 nosy: + patfer
2018-03-06 21:31:35	jendrik	set	status: unread -> chatting messages: + msg6818
2018-03-06 21:29:52	jendrik	create

Issue762