Issue762

Title complain about incorrect SAS file
Priority wish Status in-progress
Superseder Nosy List jendrik, malte, patfer
Assigned To patfer Keywords
Optional summary

Created on 2018-03-06.21:29:52 by jendrik, last changed by jendrik.

Files
File name Uploaded Type Edit Remove
rubiks_cube.sas jendrik, 2018-03-06.21:29:51 application/octet-stream
Messages
msg10189 (view) Author: jendrik Date: 2021-03-18.14:23:35
Patrick, do you still plan to pursue this?
msg6987 (view) Author: jendrik Date: 2018-04-03.21:45:08
issue700 is now merged.
msg6957 (view) Author: patfer Date: 2018-03-27.08:25:49
Your comments are processed and then we wait for merging issue700.
msg6940 (view) Author: jendrik Date: 2018-03-17.10:02:07
I left some more comments on Bitbucket. I think it's best to wait with the next 
review until after issue700 has been merged into default and from there into the 
branch for this issue. Before merging the default into this branch, please 
address the comments on Bitbucket, though. Otherwise, they'll easily slip under 
the radar.
msg6937 (view) Author: patfer Date: 2018-03-16.21:16:11
I incorporated the comments.
There are now two types of error output.
1. Parsing errors which show what is the error (Unable to parse next int) +
Context (Context: mutex group X)
2. Value errors. Short sentence telling the value error similar to what
check_magic is doing(Invalid number of operators in NAME.)

The check for ordered conditions can now be enabled/disabled with a global flag
which is hardcoded. I think there is no need for a commandline arguments. If at
one point some techniques depent on the ordering, the user should not be able to
disable it.

The pull request for new comments:
https://bitbucket.org/PatFer/issue762/pull-requests/1/pr-fix-issue762-rev2/diff
msg6915 (view) Author: jendrik Date: 2018-03-16.16:53:39
I left some comments on Bitbucket, but I think it's best to wait now until you've 
resolved the accumulated comments from me and Malte :-)
msg6898 (view) Author: malte Date: 2018-03-16.14:58:06
Interesting!

The translator does perform the sorting upon creating the SASTask object, and it
does contain validation code that verifies the invariants after constructing the
object. This invariant check is disabled by default, but can be enabled by
setting DEBUG = True in sas_tasks.py.

I think the issue is that the SAS task is modified after it is first created,
which breaks the order. It was intended to be conceptually immutable, but we
don't enforce this, and I assume this broke when we integrated the former
preprocessor into the translator, which reorders the variables after the task is
created.

The main purpose of sorting is not really finding things quickly, but rather
iterating through the sequence in sorted order, which allows processing things
in a consistent order and makes it easier to verify other properties like
"cannot have two copies of the same proposition" or "cannot have two goals for
the same variable". These are trivial to verify when sorted (using approaches
like "is_sorted_unique" in utils/collections.h), but without sorting, verifying
the properties is essentially as expensive as sorting, then verifying. Also,
sorted order for things like conditions and effects helps provide a memory
access pattern that works nicely with prefetching in CPU cache architectures.

So on the balance, and considering we already go to the pain of sorting things
in the translator, I think the canonical order of things is desirable. However,
repairing the order is likely to affect performance of order-sensitive
heuristics and other things like stubborn set algorithms, so will require
substantial experiments.

One possibility is to leave off the validation of sorting for now (e.g.
disabling it with a global constant that we set to false for now), but opening
an issue that looks into creating the desired order again. This should perhaps
also add a SASTask.validate call in the translator (controlled by some debug
flag) at the point where the task is written.
msg6894 (view) Author: patfer Date: 2018-03-16.14:39:33
I just detected that the test pipeline does not run successfully on the modified
task. The problem is that the translator defines the sas file restrictions, but
does neither follow nor check them itself.

The current problem is the restriction that any conditions (prevail, or effect
conditions or goal) and mutexes in a group are sorted by (var, value) and
translator does not sort them.

What are your thoughts on this? How much performance would it cost to sort the
conditions in the translater?

I think having them sorted is nice, because if you are looking for something,
then you find it faster, BUT the sorting condition has no semantic influence (do
we have tie breakings which could rely on it?) and I would drop this restriction.
msg6891 (view) Author: patfer Date: 2018-03-16.13:59:24
Unifying the phrasing makes sense.

Just failing in read_next<>() is an improvement, as we are then checking the
streams errors, but, the error messages would then look like: "Failed to read
next int". Finding where the int was expected is difficult.

I think let check_fact return a boolean would make more sense for me too. As
developer I would expect it from the name to return an boolean value. The cost
of this is to add a lot of code. At every check_fact() we need to add the error
output and the execution stop.

I would keep the level of detail:
+ If it fails now, the user knows relative easily why and where it failed.
+ It is less likely to fail silently and solve some nonsense.
o Complexity: Except for some if statements (and sometimes an integer counter)
the runtime should have no performance penalties.
- The code becomes more bloated
    Indeed adding checks adds code. The methods used for checks are already
stored in a separate header file.
msg6886 (view) Author: jendrik Date: 2018-03-16.12:47:41
I had a quick look at the pull request. If we want to keep this level of detail 
for the error messages, I think we should simplify and unify them a bit more. 
But first we need to decide whether we want this level of detail.

I think that it should be enough to pass the input stream to read_next<>() and 
let it fail loudly if it cannot read the required type. I don't think we 
necessarily need to provide context of where the failure happened.

For the range checks, I think we should let the checks return Booleans 
indicating whether the value is inside the expected range and use an if-
statement to print an error if it isn't.
msg6873 (view) Author: jendrik Date: 2018-03-15.19:35:52
I haven't made any comments, yet. Will do so later tonight or tomorrow.
msg6871 (view) Author: patfer Date: 2018-03-15.18:51:23
A pull request for the fix can be found at:
https://bitbucket.org/PatFer/issue762/pull-requests/1/issue762-fixed/diff

Jendrik have you already made comments on the old pull request? Because by
adding a new one for the new code, I cannot see the old pull request anymore.
msg6846 (view) Author: malte Date: 2018-03-15.11:01:41
Indeed, there is nothing wrong with tasks without operators, and we currently
produce them in the translator if it detects unsolvability, for example for some
tasks in the Mystery domain.

The restrictions to at least one state variable, at least goal and domain sizes
at least 2 are in some sense unnecessary, but as far as I recall we have decided
some time ago to enforce them (for now) to make our life easier, especially as
the search component input file format is not intended to be user-facing. It
would make sense to lift the restriction at some point, but this does not have
high priority.

We already discussed what should or should not be a valid file in the context of
the translator, and the current state of affairs is described in
src/translate/sas_tasks.py. (It is spread over multiple doc comments.) If we
want to validate the search component input more carefully, then we should
ideally use exactly the same rules of validity that we define in the translator.
(Of course, the rules defined there can be changed if we see that they are
inadequate.)

Also, because of DRY, what is or is not valid should be documented in exactly
one place, and the other places that uses the same notion of validity should
contain a pointer to this one authoritative place, so that in later edits we can
see what the rules are.
msg6845 (view) Author: jendrik Date: 2018-03-15.08:48:30
I think tasks without operators should be allowed.

Can you make a pull request and link to it, please?
msg6841 (view) Author: patfer Date: 2018-03-15.01:40:26
I reworked the parsing of the search component to perform checks if the strings
could be correctly parsed and if the parsed values are valid. This solves this
issue762. In the same run, I added the checks desired in issue458 and added more
checks (see list below). This should resolve issue458. Furthermore, issue458 was
related to issue54 and decided to document that derived variables can only be
binary. The parsing checks that axioms are binary. We still have to add this to
the website.

First part: Parsing correctly strings to value
With some safe exceptions all calls to cin >> variable are replaced by a
function. The function performs cin >> variable AND checks the error bits of
cin. If error bits are set, it outputs error (this explains which error istream
had not where the error during parsing occured).
The function can receive additionally to the mandatory arguments a list of
arbitrary arguments and if an error occurred, the arguments are printed on cerr.
this can be used to provide meaningful error messages (except for the if checks
the complex part is handled at compile time and should not impact the runs).
Checks after getline() error bits, too.

Secondly, the parsed values are checked if they are valid in the context. That
includes:
-checking magic words (previously done)
-num variables > 0 (issue458)
-num goal > 0 (issue458)
-values per variable > 1 (issue458)

-num mutexes/axioms >= 0
  -mutex group sizes > 1
-num operators > 0
  -num previal conditions >= 0
  -num effects >= 0

-value variables axiom layer property > -2
-for initial state & goal conditions & mutexes & prevail conditions & effects:
   the variable-value pairs are checked to be in bounds (if applicable -1 is
also allowed)
- Checks that input is empty after parsing

It should now detect all invalid problem files and provide some error message
(instead of SegFaults), but there are two restrictions on the output:
A) The operator/axiom parsing is done in another class (constructor of Operator,
 Condition and Effect) and thus, on error we do not know anymore in which
operator/condition/effect we are (for Variables we can use the iteration
variable to identiy malicious item in the sas file).
B) For efficiency reasons we have not switched to a line by line parsing. Thus,
on blocks defining variables, we do not know how the parsed integers are
arranged (e.g. if one variable-value pairs per line is expected, we do not
recognize if one value is missing, but just take the value from the next line.
This is detected latest at the next magic word.)
msg6818 (view) Author: jendrik Date: 2018-03-06.21:31:34
This is related to issue458. While issue458 deals with *degenerate* inputs, this 
issues deals with *incorrectly formatted* input.
msg6817 (view) Author: jendrik Date: 2018-03-06.21:29:51
We found that the attached SAS file (simplified) is accepted by Fast Downward 
even though it is missing the number of operators. This leads to no operators 
being parsed. The planner should loudly complain about the format being wrong.
History
Date User Action Args
2021-03-18 14:23:35jendriksetmessages: + msg10189
2018-04-03 21:45:08jendriksetstatus: reviewing -> in-progress
messages: + msg6987
2018-03-27 08:25:49patfersetmessages: + msg6957
2018-03-17 10:02:07jendriksetmessages: + msg6940
2018-03-16 21:16:11patfersetmessages: + msg6937
2018-03-16 16:53:39jendriksetmessages: + msg6915
2018-03-16 14:58:06maltesetmessages: + msg6898
2018-03-16 14:39:33patfersetmessages: + msg6894
2018-03-16 13:59:24patfersetmessages: + msg6891
2018-03-16 12:47:41jendriksetmessages: + msg6886
2018-03-15 19:35:52jendriksetmessages: + msg6873
2018-03-15 18:51:23patfersetmessages: + msg6871
2018-03-15 11:01:41maltesetmessages: + msg6846
2018-03-15 08:48:30jendriksetstatus: chatting -> reviewing
messages: + msg6845
2018-03-15 01:40:26patfersetassignedto: patfer
messages: + msg6841
nosy: + patfer
2018-03-06 21:31:35jendriksetstatus: unread -> chatting
messages: + msg6818
2018-03-06 21:29:52jendrikcreate