Issue1061

Title segfault when generating pdbs for pho_constraints
Priority bug Status resolved
Superseder Nosy List augusto, florian, jendrik, malte, silvan, thomas
Assigned To Keywords
Optional summary

Created on 2022-08-10.20:22:11 by florian, last changed by thomas.

Messages
msg10846 (view) Author: thomas Date: 2022-09-12.15:16:36
Merged.

The quickfix proposed by Silvan in msg10806 was implemented to get rid of the segfault. The underlying issue is still unresolved and further discussed in issue1065.
msg10844 (view) Author: florian Date: 2022-09-12.10:59:12
I created issue1065 for the underlying problem and copied the nosy list from here. Of course feel free to un-nosy yourself if you are not interested.
msg10842 (view) Author: malte Date: 2022-09-12.10:51:26
Probably yes. The way I read the discussion, we hope that with the new component interaction model this will have to be addressed necessarily, but it's probably good to have an explicit record of this problem regardless to inform whatever decisions we make at that point.
msg10841 (view) Author: florian Date: 2022-09-12.10:46:39
I'm fine with merging the change like this as a hotfix.

But the underlying issue is not really solved by this. The log in the information object still becomes a dangling reference when the pattern generator is deleted. With the patch, we just do not access it anymore afterwards. Should we create a follow-up issue for a better solution?
msg10840 (view) Author: silvan Date: 2022-09-12.10:15:58
I have nothing to add.
msg10839 (view) Author: malte Date: 2022-09-12.09:53:47
Looks good to me.

I'd write the change log entry differently -- focusing on the user perspective, not the implementation details. It's enough to say that we fixed a crash there without mentioning how we changed the code to do it. But we always polish these when we do a release, so for me this looks good to merge.

Can one of the people that know more about this (perhaps Silvan?) also give a go-ahead?
msg10838 (view) Author: thomas Date: 2022-09-12.09:51:10
I have opened a pull request that implements the change proposed by Silvan in msg10806. The pull request can be found here:
https://github.com/aibasel/downward/pull/121
msg10808 (view) Author: silvan Date: 2022-08-11.22:19:02
The builder solution will at least force to think about how the log, if the information class should still have one, gets there - be it by moving from the generator, creating their own copy or something else.
msg10807 (view) Author: florian Date: 2022-08-11.16:42:18
I'm not sure how this interacts with the component interaction problem but it sounds like exactly the kind of problem that we discussed there. Would the problem go away after we switch everything to builders? If so, i think a short term solution like swapping the two lines would be sufficient for now.
msg10806 (view) Author: silvan Date: 2022-08-11.13:34:17
You are absolutely right, the problem is that the log of the information is a reference to the log of the generator. The pho code apparently is the only one deleting the generator after using it, see the following lines in pho_constraints.cc:

...
pattern_generator = nullptr;
pdbs = pattern_collection_info.get_pdbs();
...

Switching these lines would fix this particular error, but of course, this error will always occur when deleting the generator and still using the result of the generator, i.e., the information object. 

We could fix this by moving the log out of the generator, thus making the generator unusable afterwards. Currently, all generators produce results stored in shared pointers, handed to the information classes. Still, I'm sure the generators couldn't just be re-used without some effort after having them used once. 

We could also "share" the log like everything else which is produced by generators and handed to information classes, but as advertised since years (c.f. issue906, and also related, issue911), I'm really against this. I think I likely wouldn't have made this error if the information classes where transferring ownership, rather than keeping shared pointers to everything created from the generators. The generators should be one-time usable factories that create everything they want and then go out of scope.

Another solution could be to stop producing output in the information classes. This would probably be the easiest fix for the time being.
msg10805 (view) Author: florian Date: 2022-08-11.13:02:22
Silvan, I think you had contact with the logs and the pattern generation method. Could it be that the logger somehow goes out of scope and is destroyed? Of course it could also be a memory corruption at a different place in the code.

Using cpdbs with the same pattern collection works fine, so it has to be something related to the pho constraints. Maybe cpdbs create the PDBs earlier and the pho constraints delay the construction to a point where something is no longer available.
msg10804 (view) Author: florian Date: 2022-08-10.20:22:11
This call lead to a segfault
./fast-downward.py $DOWNWARD_BENCHMARKS/blocks/probBLOCKS-4-0.pddl --search "eager_greedy([operatorcounting([pho_constraints(patterns=systematic(2))])])"

GDB shows the stacktrace in this line https://github.com/aibasel/downward/blob/main/src/search/pdbs/pattern_collection_information.cc#L72
printing the char[45] (so not the timer in the line below) to the log. This is strange because printing the text in line 63 works fine.
History
Date User Action Args
2022-09-12 15:16:36thomassetstatus: reviewing -> resolved
messages: + msg10846
2022-09-12 10:59:12floriansetmessages: + msg10844
2022-09-12 10:51:26maltesetmessages: + msg10842
2022-09-12 10:46:39floriansetmessages: + msg10841
2022-09-12 10:15:58silvansetmessages: + msg10840
2022-09-12 10:09:42thomassetstatus: chatting -> reviewing
2022-09-12 09:53:47maltesetmessages: + msg10839
2022-09-12 09:51:10thomassetnosy: + thomas
messages: + msg10838
2022-08-11 22:19:02silvansetmessages: + msg10808
2022-08-11 16:42:18floriansetmessages: + msg10807
2022-08-11 16:18:00augustosetnosy: + augusto
2022-08-11 13:34:17silvansetmessages: + msg10806
2022-08-11 13:02:22floriansetstatus: unread -> chatting
messages: + msg10805
2022-08-10 22:41:43silvansetnosy: + silvan
2022-08-10 20:22:11floriancreate