Issue470

Title Large variable domain in visitall triggers assertion in cg heuristic
Priority bug Status resolved
Superseder Nosy List florian, malte
Assigned To malte Keywords
Optional summary

Created on 2014-09-18.18:00:54 by florian, last changed by florian.

Messages
msg3595 (view) Author: florian Date: 2014-09-29.08:13:44
Merged.
msg3594 (view) Author: malte Date: 2014-09-28.22:30:05
Great! :-) Looks good to merge.
msg3592 (view) Author: florian Date: 2014-09-28.21:49:16
Results for the cg heuristic look good at first glance:

http://ai.cs.unibas.ch/_tmp_files/pommeren/issue470-base-issue470-v1-cg-compare.html
msg3581 (view) Author: malte Date: 2014-09-28.17:19:18
I had a look at the iPDB differences. It's mainly caused by one task with a
close to perfect search time score before the change and a zero search time
score afterwards.

It spends all its time in generating the heuristic, which doesn't contribute to
the search time. In the baseline version, heuristic generation finished after
1700+ seconds, and then the task was solved quickly. In the new version
heuristic generation fails to finish. So this looks like a typical case where
it's bad not to set a timeout for the precomputation phase -- not something we
need to worry about here, but perhaps we should change our default ipdb config
to include a timeout. (But then we have the problem that the timeout should
really depend on the overall timeout -- perhaps something we should support in
the future when the new driver script is merged and we could support such things
on the Python level.)
msg3580 (view) Author: malte Date: 2014-09-28.17:09:13
Do the iPDB results look mildly worrying to you, too? Should we double-check
that we didn't adversely affect that configuration?
msg3577 (view) Author: florian Date: 2014-09-28.14:34:48
I updated the report. The remaining unexplained errors are caused by a different
failed assertion. I'll start a new issue for those.

http://ai.cs.unibas.ch/_tmp_files/pommeren/issue470-base-issue470-v1-compare.html

The second experiment is still running.
msg3575 (view) Author: malte Date: 2014-09-28.01:18:34
Ah, that's a good point. For what it's worth, where feasible we should always
try to merge in current default before experiments, at least if these are the
last intended experiments before merging. Otherwise we have no sanity test that
code does the right thing when it is merged into default.

> I will check if the errors are actually cause by this, or something else is
going on,
> tomorrow.

I wouldn't bother -- I think we should merge from default anyway and rerun the
experiments for this anyway, and then the experiments based on the "old"
issue470-base revision won't matter. Does this make sense?
msg3574 (view) Author: florian Date: 2014-09-28.00:58:10
Still heaps of unexplained errors probably caused by issue469, we should merge
in the current default. I will check if the errors are actually cause by this,
or something else is going on, tomorrow.

http://ai.cs.unibas.ch/_tmp_files/pommeren/issue470-base-issue470-v1-compare.html#unexplained-errors
msg3564 (view) Author: malte Date: 2014-09-27.19:02:54
Thanks! I should have been clearer: I meant pdb() and gapdb() literally, hence
the parentheses. (BTW, I don't care too much about being consistent about this
in the experiment code, but usually our convention for Python code is to use ""
for strings, not ''.)

I added the pdb() and gapdb() configs and the CG experiment. I also changed
"astar(ipdb)" to "astar(ipdb())" -- I'm not sure if it's a bug or a feature that
we allow omitting the parentheses in this case, but I see it more on the bug side.

Can you start the experiment? I think this needs to be in debug mode again
because the original CG issue was a failed assertion.
msg3562 (view) Author: florian Date: 2014-09-27.18:12:52
The diff looks fine to me. I started setting up an experiment and pushed it with
the necessary tags. So far, it only contains configs we have in lab
(merge-and-shrink and ipdb) can you add the configs for gapdb and pdb you
wanted? For the CG heuristic we need a second experiment, assuming you want to
evaluate on the satisficing benchmark suite. You can just copy the experiment I
added and change optimal to satisficing.
msg3481 (view) Author: malte Date: 2014-09-20.14:45:18
I had some free time today and had a stab at this.

I think this needs code review: it's easy to make mistakes there (I think I
fixed two off-by-one errors in the existing code, and in a review before my own
commit I noticed that I accidentally inverted a condition), and these issues
wouldn't necessarily be caught by regular experiments because some of them are
rare corner cases.

It's probably a good idea to perform an experiment too, to be on the safe side
(merge-and-shrink, CG heuristic, ipdb(), gapdb(), pdb()). Can you set up an
experiment or give me a pointer explaining how to best set one up? (Many things
have changed since I last did.)

The branch consists of two commits, shown here:
https://bitbucket.org/malte/downward/commits/branch/issue470
(Let me know if you would prefer a pull request.)
msg3449 (view) Author: malte Date: 2014-09-18.18:58:13
Hmmm, I hoped I would find time to do this now, but no such luck, so I'm
unassigning again.

I think the clean way to fix this is to guard against overflow by using division
in place of multiplication in the same way that this is done in the PDB and M&S
code.

However, this should be refactored to use a common utility method, such as "bool
can_multiply_safely(int a, int b)" which should assert a >= 0 and b >= 0 and
return true if a * b fits into an int (using the same technique as currently
done in the PDB and M&S code, and making sure it doesn't divide by 0).

Anyone can take care of this?

The affected places (apart from cg_cache.cc) can be found by searching for
"overflow" in the following files:

merge_and_shrink/shrink_fh.cc
merge_and_shrink/shrink_strategy.cc
pdbs/pattern_generation_edelkamp.cc
pdbs/pattern_generation_haslum.cc
pdbs/util.cc
msg3443 (view) Author: florian Date: 2014-09-18.18:00:54
Visitall tasks from visitall-sat11-strips starting from task problem32.pddl
trigger the following assertion in CGCache:

cg_cache.cc:25: CGCache::CGCache(): Assertion `g_variable_domain[var] <= 1000'
failed.
History
Date User Action Args
2014-09-29 08:13:45floriansetstatus: reviewing -> resolved
messages: + msg3595
2014-09-28 22:30:05maltesetmessages: + msg3594
2014-09-28 21:49:16floriansetmessages: + msg3592
2014-09-28 17:19:18maltesetmessages: + msg3581
2014-09-28 17:09:13maltesetmessages: + msg3580
2014-09-28 14:34:48floriansetmessages: + msg3577
2014-09-28 01:18:34maltesetmessages: + msg3575
2014-09-28 00:58:10floriansetmessages: + msg3574
2014-09-27 19:02:54maltesetmessages: + msg3564
2014-09-27 18:12:52floriansetmessages: + msg3562
2014-09-20 14:45:18maltesetstatus: chatting -> reviewing
assignedto: malte
messages: + msg3481
2014-09-18 18:58:13maltesetstatus: unread -> chatting
assignedto: malte -> (no value)
messages: + msg3449
2014-09-18 18:40:55maltesetassignedto: malte
2014-09-18 18:00:54floriancreate