Issue1018

Title: Switch hash effects of AbstractOperator (PatternDatabase) to int
Priority: bug    Status: resolved
Superseder:    Nosy List: florian, jendrik, malte, silvan
Assigned To: silvan    Keywords:
Optional summary:

Created on 2021-04-29.11:51:36 by silvan, last changed by silvan.

Messages
msg10282 (view) Author: silvan Date: 2021-05-04.10:24:48
Merged. Thanks for the review!
msg10280 (view) Author: malte Date: 2021-05-03.13:18:21
Ready to merge from my side.
msg10278 (view) Author: silvan Date: 2021-05-03.10:13:08
I opened issue1020 and issue1021 for those two wishes.

I also added an entry to CHANGES.md. Can I merge this issue then?
msg10275 (view) Author: jendrik Date: 2021-04-30.19:45:34
For the SCP papers, I introduced a max_generated_patterns option for hillclimbing() and always set it to 200. This yields roughly the same coverage as max_time=100 when used together with Cartesian abstractions and systematic patterns up to size 2. Most importantly, it has the advantage of being deterministic (which is important when comparing different abstraction orders for SCP).
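To illustrate, the two bounds would be spelled roughly like this (a sketch; max_generated_patterns was added for the SCP experiments and may not be available in every branch):

    deterministic bound:  cpdbs(hillclimbing(max_generated_patterns=200))
    time-based bound:     cpdbs(hillclimbing(max_time=100))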
msg10272 (view) Author: malte Date: 2021-04-30.18:21:42
I also like using the explicit cpdbs(hillclimbing()) over ipdb(). But I think we should add to the documentation that ipdb() can be expressed like this, as I think we do for other cases (e.g., astar, eager_greedy, lazy_greedy). Can someone open an issue for this?


Looking at the log, the new code isn't much faster in the initial stages of the hill-climbing run, but it becomes *a lot* faster over time as the patterns get much larger. It seems to have a very significant speed advantage for *large* PDBs. At the point where the old version times out after 920.245 seconds, the new version has only used 776.81 seconds, and based on a side-by-side look at the logs, this looks like a genuine performance improvement and not a case of extreme grid noise or whatever. Well, maybe. It is a bit hard to tell; it is also possible that the new config was one of the last ones running on a machine that was clearing out while it was running.

What the log clearly shows is that the bounds we use (max PDB size and max collection size) are not strongly enough correlated with memory usage. The last data point on max collection size we see is 18722780, which requires roughly 80 MB of space for the PDBs. But at this point we have already used around 3 GB of memory for the PDB collections.

I think what is going on here is that we only bound the size for the PDBs that we actually selected, but also keep around lots of other PDBs that we generated as candidates and didn't select. This is stupid and should probably be improved.

I think it would also be nice to open an issue for this -- basically, come up with better ways of controlling the memory usage of iPDB. I don't think it would be that complicated.


A third point perhaps worth noting is that we run for 920 seconds despite a 900-second limit. Perhaps we should check the time more frequently.
msg10268 (view) Author: silvan Date: 2021-04-30.17:23:06
According to the logs, what you assumed is exactly what happens in this snake instance. The experiment is in /infai/sieverss/repos/downward/dev/experiments/issue1018/data/issue1018-v1 and the log files are in:
old: runs-03201-03300/03218/run.log
new: runs-08601-08700/08699/run.log

I think a simple option to restrict memory would be to use a lower value for collection_max_size.

I have switched to using cpdbs(hillclimbing()) over ipdb() because I think the latter obscures many things and because I also prefer to write hill climbing and compare it to, say, CEGAR, in papers for clarity. Technically, ipdb is its own plugin, but it doesn't do more than add the hill climbing options, parse them, and manually create the generator instead of parsing it.
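For readers of this issue, the two spellings side by side (as used in this discussion; exact defaults not checked here):

    --search "astar(ipdb())"
    --search "astar(cpdbs(hillclimbing()))"

Since ipdb only forwards the hill climbing options to cpdbs, the two should behave identically.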
msg10266 (view) Author: malte Date: 2021-04-30.13:04:36
Runtime actually improves. Search time increases a tiny bit in the first and third configuration, but overall runtime gets better, suggesting that PDB construction has become faster, which makes some sense for the nature of the change.

The search time score decreases in the middle configuration because we lose the first snake task, which the old version solved with a search time of 0.01 sec. If this didn't happen, search time would improve in this configuration overall, and by more than the tiny bit that it gets worse in the other configurations.

It looks like the new version runs out of memory during PDB generation for this task. Should this happen? It's a bit stupid to run out of memory in a preprocessing phase that we could abort at any time. Not a problem strongly related to this issue, but if we see the algorithms doing something stupid, perhaps it's a good idea to make a note of it. The old version runs quite close to the memory limit, and it looks to me like it's saved only by the 900s timeout for PDB generation. The new version is faster in preprocessing and therefore manages to hit the memory limit within the 900s -- or at least that's what it looks like to me.

Do we still give frequent progress updates on the PDB hill-climbing in the log? If yes, I'd be interested in seeing the log for this task (old and new version), now that every line of output is accompanied by peak memory statistics. Can we find it on the grid somewhere?

To be clear, this is unrelated to this issue. The question is whether we should think about a follow-up issue to look at bounding the memory of the hill-climbing PDBs better.

BTW, it looks like this is an iPDB-style configuration, but it doesn't use "ipdb". Is "ipdb" essentially an alias now? Then it would be good for the documentation to reflect this.
msg10263 (view) Author: silvan Date: 2021-04-30.11:10:18
Experiments look good to me:
https://ai.dmi.unibas.ch/_tmp_files/sieverss/issue1018-v1-issue1018-base-issue1018-v1-compare.html
Memory decreases at the cost of a minimal increase in runtime.
msg10259 (view) Author: malte Date: 2021-04-29.16:53:44
Looks good to me. An experiment would be good, for example with iPDB, to see if there is a performance change. If performance changes at all, I would expect the impact to be beneficial (faster computation of the perfect hash because we now use 4-byte quantities rather than 8-byte quantities), but in any case I think it would be good to run an experiment. I don't think it needs to cover many different kinds of PDB heuristics.
msg10258 (view) Author: silvan Date: 2021-04-29.15:08:20
Done and ready for a (quick) review.
msg10257 (view) Author: malte Date: 2021-04-29.13:01:26
I agree.
msg10255 (view) Author: silvan Date: 2021-04-29.12:08:47
For the moment, I only fixed the code by changing AbstractOperator: https://github.com/aibasel/downward/pull/39

I think, however, that we should instead use int everywhere in PatternDatabase and friends, which also means switching num_states and hash_multipliers. What do you think?
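For illustration, a minimal sketch (the names pattern and domain_sizes are placeholders, not the actual Fast Downward members) of what the int-based computation of these two members could look like; int is enough as long as num_states stays below the usual max_pdb_size bound:

    #include <vector>

    // Computes the perfect-hash multipliers and the number of abstract states
    // for the given pattern, using int throughout.
    void compute_hash_info(const std::vector<int> &pattern,
                           const std::vector<int> &domain_sizes,
                           std::vector<int> &hash_multipliers,
                           int &num_states) {
        num_states = 1;
        hash_multipliers.clear();
        hash_multipliers.reserve(pattern.size());
        for (int var : pattern) {
            hash_multipliers.push_back(num_states);
            num_states *= domain_sizes[var];  // caller must ensure this fits in an int
        }
    }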
msg10254 (view) Author: silvan Date: 2021-04-29.11:51:35
AbstractOperator stores its effect on ranked states (a.k.a. hash indices) in the form of a value change to the ranks/hash indices of states. These value changes can clearly be positive or negative. Currently, the code uses size_t for hash indices everywhere, including the effect. This causes overflows and only works correctly because size_t is guaranteed to "wrap around" when overflowing. We should switch at least the hash effect to be an integer, and maybe switch all size_t in the PDB-related code to int.
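To make the problem concrete, here is a minimal sketch (illustrative only, not the actual Fast Downward code; VariableEffect and the function name are made up for this example) of how such a signed effect is computed and applied:

    #include <vector>

    // rank(s) = sum_i hash_multipliers[i] * s[pattern[i]], so the effect of an
    // abstract operator on a rank is the sum of the per-variable value changes,
    // weighted by the hash multipliers -- a quantity that can be negative.
    struct VariableEffect {
        int pattern_index;   // position of the affected variable in the pattern
        int value_before;    // precondition value
        int value_after;     // effect value
    };

    int compute_hash_effect(const std::vector<VariableEffect> &effects,
                            const std::vector<int> &hash_multipliers) {
        int hash_effect = 0;  // signed: decreasing a value makes this negative
        for (const VariableEffect &e : effects)
            hash_effect += hash_multipliers[e.pattern_index] *
                           (e.value_after - e.value_before);
        return hash_effect;
    }

    // Applying the operator to a ranked state: successor_rank = state_rank + hash_effect.
    // With size_t, a negative hash_effect only yields the correct successor because
    // unsigned arithmetic wraps around, which hides the intent.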
History
Date                 User     Action  Args
2021-05-04 10:24:49  silvan   set     status: reviewing -> resolved; messages: + msg10282
2021-05-03 13:18:21  malte    set     messages: + msg10280
2021-05-03 10:13:08  silvan   set     messages: + msg10278
2021-04-30 19:45:34  jendrik  set     messages: + msg10275
2021-04-30 18:21:43  malte    set     messages: + msg10272
2021-04-30 17:23:06  silvan   set     messages: + msg10268
2021-04-30 13:04:36  malte    set     messages: + msg10266
2021-04-30 11:10:19  silvan   set     messages: + msg10263
2021-04-29 17:51:56  florian  set     nosy: + florian
2021-04-29 16:53:44  malte    set     messages: + msg10259
2021-04-29 15:08:21  silvan   set     messages: + msg10258
2021-04-29 13:01:26  malte    set     messages: + msg10257
2021-04-29 12:08:48  silvan   set     status: in-progress -> reviewing; messages: + msg10255
2021-04-29 11:51:36  silvan   create