Issue213

Title: Remove 32-bit and 64-bit build options.
Priority: meta
Status: resolved
Superseder:
Nosy List: andrew.coles, cedric, erez, florian, jendrik, malte, silvan, silvia
Assigned To: jendrik
Keywords:

Summary
TODOs before we can close this issue:

- Use operator IDs instead of generating operator pointers (issue692)
- Implement new hash function (issue693)
- Implement new hash table (issue694)
- Store landmarks more efficiently (issue695)
- Test that most important configurations perform as well in 64-bit builds as 
in 32-bit builds with 5 and 30 minute timeouts (this issue): done
- Make relaxation heuristics more efficient in 64-bit mode (issue814)
- Remove option for 32-bit builds from the build system (issue754)
- Run 5 minute experiments in debug mode: done
- Update wiki docs: done

Created on 2011-01-23.11:11:44 by erez, last changed by malte.

Files
File name Uploaded Type Edit Remove
chunk_allocator.h andrew.coles, 2014-10-27.00:45:25 text/x-chdr
hash_sizes.py malte, 2016-12-12.23:21:13 text/x-python
state_registry_low_memory.tar.gz andrew.coles, 2014-10-27.14:01:11 application/binary
Messages
msg8494 (view) Author: malte Date: 2019-01-21.13:58:32
Many thanks! :-)
msg8493 (view) Author: jendrik Date: 2019-01-21.13:56:24
I sent the email and moved the TODOs to issue690 and issue890.
msg8488 (view) Author: jendrik Date: 2019-01-21.12:40:04
I'll take care of this.
msg8486 (view) Author: malte Date: 2019-01-21.12:33:26
Outstanding, many thanks to everyone involved! :-)

I have reopened this for now so that the "TODOs for later" in the summary don't
get lost, as we don't usually look in resolved issues for open TODOs. Can
someone convert the remaining ones into issues and add pointers here, like for
the first entry that refers to issue838? It could be a catch-all issue or
individual ones for each item.

I think this would be worth a message to the mailing list. The main points I
would emphasize are:

- We no longer cross-compile to 32 bits by default and instead use the OS's
default options (native bit-width).

- We previously forced 32-bit builds because they were a lot more
memory-efficient. We have worked to reduce the memory usage of 64-bit builds so
that there is no longer a reason to prefer 32-bit builds.

- The main advantages of this change are that it's now possible to use more than
4 GiB of memory without making performance compromises, that builds are easier
because they don't require unusual libraries, and that it's easier to use CPLEX,
where recent versions are only available in 64-bit versions.
msg8473 (view) Author: jendrik Date: 2019-01-18.17:10:31
We just merged issue754 and updated the wiki documentation. This means all TODOs
for this issue are done now and we can close this issue :-)
msg8364 (view) Author: malte Date: 2018-12-15.17:33:39
> I don't see any blockers. Shall we go ahead and remove the option for 32-bit
> builds (issue754), Malte?

I think the results look good enough, yes. I suppose the merging of issue754
should be coordinated with the necessary buildbot and documentation changes, but
that discussion should be part of issue754.
msg8310 (view) Author: jendrik Date: 2018-12-13.23:04:46
Here are the fixed results for the 5-minute satisficing experiment:

https://ai.dmi.unibas.ch/_tmp_files/seipp/issue213-v8-sat-5min-release32-vs-release64.html

I don't see any blockers. Shall we go ahead and remove the option for 32-bit
builds (issue754), Malte?
msg8307 (view) Author: jendrik Date: 2018-12-13.20:56:17
Here are the results for the debug build:

https://ai.dmi.unibas.ch/_tmp_files/seipp/issue213-v8-opt-5min-debug.html
https://ai.dmi.unibas.ch/_tmp_files/seipp/issue213-v8-sat-5min-debug.html

The failed assertion is issue467.
msg8306 (view) Author: malte Date: 2018-12-13.20:27:32
Those LP numbers would certainly be fine in my opinion.

(One effect of the failing LP runs is that it's hard to check the output for
unexplained errors because there are so many, so I didn't really do that check.)
msg8304 (view) Author: jendrik Date: 2018-12-13.20:24:37
Regarding results for LP configurations, we still have the data from v7 
(https://ai.dmi.unibas.ch/_tmp_files/seipp/issue213-release32-vs-release64-v7-opt-30min.html)

Coverage increases for the diverse potential heuristics from 862 to 868 and
decreases from 899 to 892 for operator counting with state equation and LM-cut
constraints (this config is misleadingly called "seq" in the report).

Regarding the 5-minute satisficing experiment: the grid outage seems to have
affected this experiment. I have restarted it.
msg8302 (view) Author: malte Date: 2018-12-13.19:40:02
I added TODOs for later for CG and LM-Cut in the summary.
msg8301 (view) Author: malte Date: 2018-12-13.19:38:33
Focusing on the 30-minute results (unsurprisingly, the 5-minute results are
generally better):

- Optimal configurations look OK, except that LM-Cut isn't great. That's not
surprising given that we worked hard on performance enhancements for all delete
relaxation heuristics except for LM-Cut, which we didn't touch.

- Satisficing configurations look OK, except perhaps for h^cea. The unexplained
errors in the 5-minute experiments suggest that many runs were not started,
though. What's up there?

- The LAMA results look acceptable.

Regarding the LP results, it would be interesting to know if the results are
roughly in the same ballpark as old results or if they are massively worse. If
we lose 10-20 tasks, I wouldn't be too bothered. If we lose 100, I would.

This doesn't necessarily require new experiments if we have old data somewhere
that we can compare to.
msg8285 (view) Author: florian Date: 2018-12-13.13:19:17
> Florian and I briefly discussed the investigation into the operator-counting
> heuristics (msg7584). We believe the analysis is difficult to set up...

I don't think it is. I have old CPLEX versions and old OSI versions are
available online. The setup should be no more complicated than any other LP
tests. But I understand if you don't want to do it and I don't think we would be
able to finish this issue during this sprint if we considered this a blocker.
msg8284 (view) Author: jendrik Date: 2018-12-13.12:57:39
We repeated all experiments for v7 with an updated revision (v8). This revision
is the default branch after merging issue814. Here are the results:

https://ai.dmi.unibas.ch/_tmp_files/seipp/issue213-v8-opt-5min-release32-vs-release64.html
https://ai.dmi.unibas.ch/_tmp_files/seipp/issue213-v8-opt-30min-release32-vs-release64.html

https://ai.dmi.unibas.ch/_tmp_files/seipp/issue213-v8-sat-5min-release32-vs-release64.html
https://ai.dmi.unibas.ch/_tmp_files/seipp/issue213-v8-sat-30min-release32-vs-release64.html

https://ai.dmi.unibas.ch/_tmp_files/seipp/issue213-v8-lama-5min-release32-vs-release64.html
https://ai.dmi.unibas.ch/_tmp_files/seipp/issue213-v8-lama-30min-release32-vs-release64.html

I will start the 5min debug experiments now.
msg8283 (view) Author: jendrik Date: 2018-12-13.12:38:22
Florian and I briefly discussed the investigation into the operator-counting
heuristics (msg7584). We believe the analysis is difficult to set up since we
need an old CPLEX version that supports 32-bit and 64-bit builds (the previous
results used different CPLEX versions for the two builds). If we see a
performance degradation for the old CPLEX version, it is unclear whether we
would see it for a newer CPLEX version as well and how we could fix it. We
therefore don't consider this a blocker for removing the 32-bit build option and
I updated the summary accordingly.
msg8166 (view) Author: jendrik Date: 2018-12-06.15:38:50
Update summary: we decided that we can tackle two TODOs after this issue.
msg8165 (view) Author: jendrik Date: 2018-12-06.15:28:55
We looked at the results again and found that cea() also solves fewer tasks (-10) 
using a 64-bit build. Added this to the summary.
msg7650 (view) Author: malte Date: 2018-09-20.15:57:22
> The hff portion of lama-first does ignore action costs:
[...]
> I looked into the results for a maintenance-sat14-adl task

Great, thanks for checking!
msg7647 (view) Author: jendrik Date: 2018-09-20.15:21:02
Add issue838 to summary.
msg7645 (view) Author: jendrik Date: 2018-09-20.14:51:42
Removing the synergy is issue839. I opened a new issue since we already have a 
branch for issue123 in the repo and wanted to avoid any possible confusion this 
could cause.
msg7642 (view) Author: jendrik Date: 2018-09-20.14:36:47
The hff portion of lama-first does ignore action costs:

$ ./fast-downward.py --alias lama-first ../benchmarks/parcprinter-opt11-strips/p01.pddl
[...]
Initial heuristic value for lama_synergy(lm_rhw(reasonable_orders = true, lm_cost_type = one), transform = adapt_costs(one)): 25
Initial heuristic value for ff_synergy(hlm): 21
[...]

I looked into the results for a maintenance-sat14-adl task (maintenance-1-3-060-180-5-001.pddl): both the FF and the landmark 
heuristic have the same initial h values in both lama-first versions, but afterwards the h values differ and lama-first-syn runs 
out of memory. This doesn't look like a bug to me. I think it could be due to tie-breaking and/or differences in the 
implementation.
msg7635 (view) Author: jendrik Date: 2018-09-20.13:18:15
I made some local measurements to analyze the increase in memory usage of the configuration 
using the causal graph heuristic:

peak memory for airport:p19, dynamically linked build, revision issue213-v7 (e1efb55725b3):

--heuristic "h=goalcount()" --search "lazy_greedy([h], preferred=[h], bound=0)":
32-bit:  7724 KB
64-bit: 21220 KB

--heuristic "h=goalcount()" --search "lazy_greedy([h], preferred=[h])":
32-bit: 13316 KB
64-bit: 26740 KB


--heuristic "h=cg()" --search "lazy_greedy([h], preferred=[h], bound=0)":
32-bit: 51156 KB
64-bit: 85228 KB

--heuristic "h=cg()" --search "lazy_greedy([h], preferred=[h])":
32-bit: 58552 KB
64-bit: 92540 KB


The data suggests that the memory overhead of the cg configuration stems from the 
initialization of the heuristic, not the search. In 32-bit mode the heuristic needs about 
51156-7724=43432 KB. In 64-bit mode it needs about 85228-21220=64008 KB. This is 
corroborated by the massif profiles for the cg heuristic (attached): the 32-bit version 
needs about 20 MB less memory than the 64-bit version at the peak. The culprit for the extra 
memory usage is exclusively the helpful_transition_cache 
(std::vector<std::vector<domain_transition_graph::ValueTransitionLabel *>> 
helpful_transition_cache;) in file cg_cache.h.
I opened issue838 for reducing the memory usage of the causal graph cache and suggest to 
continue our discussion about this there.
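The pointer-size effect described above can be sketched with back-of-the-envelope arithmetic. This is a rough illustration, not a measurement: the entry count and average inner length are made-up numbers, and only the per-word sizes (4 vs. 8 bytes, vector header of three pointers) reflect typical 32- and 64-bit builds.

```python
def cache_bytes(num_outer, avg_inner_len, word_bytes):
    """Approximate heap usage of a vector of vectors of pointers.

    A typical std::vector control block is three pointers (begin, end,
    end-of-capacity), and each stored element here is itself a pointer.
    """
    vector_header = 3 * word_bytes
    return num_outer * (vector_header + avg_inner_len * word_bytes)

num_entries = 1_000_000   # hypothetical cache size
avg_len = 2               # hypothetical average inner-vector length

b32 = cache_bytes(num_entries, avg_len, 4)
b64 = cache_bytes(num_entries, avg_len, 8)
print(b32, b64, b64 / b32)   # the 64-bit estimate is exactly 2x here
```

Since every byte of such a cache is either a pointer or part of a vector header, the whole structure doubles on 64 bits, which is consistent with the roughly 20 MB gap seen in the massif profiles.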
msg7632 (view) Author: malte Date: 2018-09-20.12:42:15
Thanks, Jendrik! Very interesting!

I suspected the synergy might not be useful anymore these days since the
"regular" heuristics have seen a lot of TLC that the synergy heuristics haven't.
That was the main reason why I suggested the experiment.

I think we should revisit getting rid of the synergy.

Looking at the option string for lama-first, does the hff portion actually
properly ignore the action costs? I assume perhaps yes, via the master, but can
you verify?

The coverage difference in maintenance-sat14-adl is striking. Can you have a
quick look at the run files for the version with synergy to check if we might be
overlooking a bug there?

There is an old issue about dropping the synergy, which we can either reopen, or
create a new one pointing to the old one for additional context. Then I suggest
we run (30-minute) experiments that compare lama-first to lama-nosyn-first and
lama to lama-nosyn (i.e., the anytime configuration). If the numbers are similar
to what we see here, there is a lot that can be said in favour of removing the
synergy.
msg7630 (view) Author: jendrik Date: 2018-09-20.12:08:45
Add "Investigate drop in performance of the operator-counting heuristics 
(msg7584)" to summary.
msg7629 (view) Author: jendrik Date: 2018-09-20.12:06:36
Here are the results for lama-first and lama-first without the synergy:

https://ai.dmi.unibas.ch/_tmp_files/seipp/issue213-release32-vs-release64-v7-lama-first-30min.html

In both cases, the 32bit build solves more tasks overall than the 64bit build. Interestingly, the 
versions without synergy solve 24 (32bit) and 19 (64bit) more tasks. They also have higher time 
scores, but lower memory scores.
msg7584 (view) Author: florian Date: 2018-09-19.16:17:16
I would be interested to see if the drop in performance of the operator-counting
heuristics come from the change in CPLEX version or from our side. In the report
it was the configuration with the second-highest coverage and it had the
second-highest drop in coverage, so I think it is worth investigating.
msg7561 (view) Author: jendrik Date: 2018-09-19.12:28:35
We looked at the results and found that it would be good to create time and 
memory profiles for the configurations using cea() and cg().

Also, we want to run an experiment for LAMA-first without the synergy.
msg7514 (view) Author: malte Date: 2018-09-18.14:19:20
Thanks, Jendrik! For those of us keeping score, the total differences going from 
32 bits to 64 bits are currently (v7):

- opt-30min: coverage +17
  - worst config: BJOLP -15; this has not integrated the latest changes, I think
- sat-30min: coverage -38
  - worst config: ff-typed -11
- anytime-30min: coverage -6, quality -7.25

- opt-5min: coverage +40
  - worst config: hmax -5
- sat-5min: coverage -40
  - worst config: lazy hadd -16
- anytime-5min: coverage -2, quality -2.48

Looking at coverage is good, but I think before we merge we should also look at 
score_memory and score_total_time. If we want to parallelize the issue, it would 
be useful if someone can have a look if we have a significant drop in performance 
that can not be explained by h^max, h^add or h^FF (which are covered in issue814).
msg7512 (view) Author: jendrik Date: 2018-09-18.13:59:26
The remaining experiments for v7 finished. Here are all results requested in msg7470:

http://ai.cs.unibas.ch/_tmp_files/seipp/issue213-release32-vs-release64-v7-opt-30min.html
http://ai.cs.unibas.ch/_tmp_files/seipp/issue213-release32-vs-release64-v7-sat-30min.html
http://ai.cs.unibas.ch/_tmp_files/seipp/issue213-release32-vs-release64-v7-lama-30min.html

http://ai.cs.unibas.ch/_tmp_files/seipp/issue213-release32-vs-release64-v7-opt-5min.html
http://ai.cs.unibas.ch/_tmp_files/seipp/issue213-release32-vs-release64-v7-sat-5min.html
http://ai.cs.unibas.ch/_tmp_files/seipp/issue213-release32-vs-release64-v7-lama-5min.html
msg7491 (view) Author: jendrik Date: 2018-09-17.15:40:47
I have started the remaining experiments.
msg7470 (view) Author: malte Date: 2018-09-15.15:32:31
Many thanks! This confirms that there is still work to do with the satisficing
configurations.

I suggest we also run an experiment with Lama (full), which I guess would need
to be separate from the non-anytime planners. So our full set of experiments for
the meta issue would have six parts:

- optimal, satisficing, anytime
- each run with a 5-minute or 30-minute timeout

If we have these six packaged together in the repository somewhere, we could
also use them in the future for similar whole-planner performance experiments.
I'm not suggesting anything fancy, just six experiment files within this issue
with a clear naming convention that contain all the experiments in one place,
i.e., include the "extra configs" etc. Something like

    issue213-v7-release32-vs-release64-{opt,sat,any}-{5min,30min}.py

or whatever makes sense. That would help someone like me doing the same kinds of
experiments in the future.

Based on the sprint experience, I suggest we also run copies of the 5-minute
versions in debug mode at least once. I don't think there's a need for 30-minute
debug experiments.
msg7469 (view) Author: jendrik Date: 2018-09-15.09:42:08
Here are the results for the satisficing configurations using a time limit of 30 minutes:

http://ai.cs.unibas.ch/_tmp_files/seipp/issue213-release32-vs-release64-v7-sat-30min.html
msg7455 (view) Author: malte Date: 2018-09-14.11:17:39
Thanks! I would like to have all data in one place for the same revision at the
point where we actually make the transition from 32 bits to 64 bits, but we
don't need to do that right now. Because of the necessary performance
enhancements for some of the satisficing configurations, we will not close this
meta-issue during the current sprint, so there is no particular hurry.
msg7454 (view) Author: jendrik Date: 2018-09-14.11:01:48
The experiment does not contain the changes from issue695. If we don't run the experiment again, we can use the results from issue695 to judge whether BJOLP loses 
coverage when switching to 64-bit (http://ai.cs.unibas.ch/_tmp_files/simon/issue695-v2-opt-issue695-base-issue695-v2-compare.html). These results show that 
overall coverage remains almost the same (917 vs. 916 tasks).
msg7453 (view) Author: malte Date: 2018-09-14.10:26:50
Many thanks, Jendrik!

Some tasks in the experiment reach the hard limit for the log and are hence
aborted. We should look into this.

The BJOLP results don't look so good. Was the experiment started before or after
issue695 was merged?

Otherwise things look quite good! :-) I was surprised to see the strong coverage
of seq, until I realized that it isn't actually seq, but OCC with seq + LM-Cut
constraints. In future experiments, I'd prefer to call this "occ".
msg7449 (view) Author: jendrik Date: 2018-09-14.09:17:58
Here are the results for the optimal configurations with a 30 minute time limit:
http://ai.cs.unibas.ch/_tmp_files/seipp/issue213-release32-vs-release64-issue213-v7-opt-30min.html
msg7432 (view) Author: jendrik Date: 2018-09-13.14:14:56
Here are the results for the additional optimal configurations:

http://ai.cs.unibas.ch/_tmp_files/seipp/issue213-release32-vs-release64-issue213-v7-opt-extra-configs.html
msg7430 (view) Author: malte Date: 2018-09-13.12:30:49
For the experiments, we discussed offline whether it's OK to use 5 minute
timeouts throughout (as we did in most experiments here) because the experiments
are large. I think we will eventually also need to do experiments with 30-minute
timeouts.

The short experiments help us catch cases where the 64-bit build becomes a lot slower,
but they help us less when the issue is running out of memory. One can try to
simulate the memory pressure by also setting a lower memory limit, but it's hard
to calibrate, and I think it would be useful to have the data for our "real"
standard time and memory settings.

This shouldn't hold up everything else: I think it's fine to first find out what
needs to improve based on 5-minute runs and do 30-minutes at leisure when the
grid is less busy.
msg7427 (view) Author: jendrik Date: 2018-09-13.12:14:06
The experiment using the type-based open list (with lama-first) is done:

http://ai.cs.unibas.ch/_tmp_files/seipp/issue213-release32-vs-release64-issue213-v7-sat-extra-configs.html

I think it's best to wait for issue814 before we try to improve the type-based config.
msg7425 (view) Author: malte Date: 2018-09-13.11:58:32
I changed the title of this meta issue to reflect what it is now actually about.

Also, after looking at the pull request for issue754, I realize that we have in
some sense been wrong in saying that this issue is about "switching to 64-bit".
Rather, what we are doing is switch to *native bit-size* builds only. Whether
these are 32-bit builds or 64-bit builds depends on the operating system and
compiler environment.

So we should be aware in the future that it is still possible that the code is
compiled on a 32-bit platform. For example, a long may be 32 bits only, and if
we need a 64-bit data type, we need to use something that is explicitly 64 bits.
msg7422 (view) Author: jendrik Date: 2018-09-13.11:45:15
Experiments for additional configs are running.
msg7399 (view) Author: malte Date: 2018-09-12.14:36:34
Many thanks, Jendrik! Indeed it looks like time is the issue, not memory.

So we'll have issue814 look at time and memory instead of only memory. We have
already found the wiki documentation on time profiling.

In some cases, the behaviour seems to vary quite a bit depending on which
heuristic is used. Do we have sufficient coverage of heuristics in these
experiments?

More generally, do we include all major planner components we care about? How
difficult would it be, for example, to run all the configurations used in the
IPC (including portfolios), or rather the closest approximations of these
configurations that we have in the master repository? I mean both the
satisficing and the optimal case.

It's not necessary to run the exact configurations, but if there are some larger
parts of the planner, like a specific heuristic or open list or search algorithm
used in the IPC that we don't currently cover, then it would be good to change
that. For example, I think stubborn sets can be quite time-critical, but we
don't currently cover them, right?
msg7397 (view) Author: jendrik Date: 2018-09-12.14:06:38
Interestingly, at least for lazy greedy add(), the reason for solving fewer tasks 
is not the extra memory, but running out of time. But I guess we can hope that 
using less memory makes the code more cache efficient again.
msg7394 (view) Author: jendrik Date: 2018-09-12.13:47:00
I added memory profiling instructions at http://www.fast-downward.org/ForDevelopers/Profiling.
msg7387 (view) Author: malte Date: 2018-09-12.11:19:36
Making lazy greedy search more memory-efficient in 64-bit mode is issue814.
msg7385 (view) Author: malte Date: 2018-09-12.11:14:28
Thanks, Jendrik! The question is at what point we are happy to move to 64-bit.
Some possible answers:

1. When 64-bit is as good or better than 32-bit across the board (or at least
close).
2. When 64-bit is as good or better than 32-bit on average.

If we just add up the coverage scores, we currently have +16 for 64-bit in the
optimal experiment and -30 for 64-bit in the satisficing experiment, so we are
not too far away. The numbers should get better when issue695 is integrated.

Of course that doesn't mean that we should ignore the existing memory
inefficiencies. I guess the next step there should be to do a memory profile for
one of the bad configurations in 32-bit and 64-bit mode? The worst configuration
seems to be lazy_greedy_add = [u'--heuristic', u'h=add()', u'--search',
u'lazy_greedy([h], preferred=[h])']

Jendrik, can you give some for-dummies instructions of how to do the memory
profiles that we used before to diagnose this?
msg7383 (view) Author: jendrik Date: 2018-09-12.09:30:12
I ran two experiments:

http://ai.cs.unibas.ch/_tmp_files/seipp/issue213-release32-vs-release64-issue213-v7-opt.html
http://ai.cs.unibas.ch/_tmp_files/seipp/issue213-release32-vs-release64-issue213-v7-sat.html

It looks like there's still work to do, especially for the satisficing configurations.
msg7365 (view) Author: jendrik Date: 2018-09-11.12:13:52
I will most certainly not!
msg7363 (view) Author: malte Date: 2018-09-11.11:29:28
Please ignore this (email test).
msg7356 (view) Author: jendrik Date: 2018-09-10.12:12:09
Update summary.
msg6258 (view) Author: jendrik Date: 2017-04-27.23:33:47
Converted to meta-issue and updated the summary.
msg6253 (view) Author: malte Date: 2017-04-27.19:13:15
This issue is next in the review queue, but it really looks like a meta-issue
rather than a regular one. Can you

1) convert it into a meta-issue,
2) update the summary to include the current status and open TODOs, and
3) let me know which of the subissue is the one that should be reviewed next and
update the entry in the review queue accordingly?
msg5905 (view) Author: florian Date: 2016-12-19.14:08:13
Added a reference to issue687.
msg5890 (view) Author: jendrik Date: 2016-12-14.21:25:33
Of course we can test the new hash function on multiple optimal configurations. I 
don't expect any surprises if we're happy with the performance of blind search, 
but you never know :-)
msg5887 (view) Author: malte Date: 2016-12-14.18:28:50
> Regarding the first item: bjolp uses uniform cost partitioning, so CPLEX is not
> involved. Probably you meant seq instead of bjolp. I suggest to rerun the
> experiment for seq once we've merged the new hash table implementation.

Yes, I meant seq. Regarding rerunning the experiments, does this mean that you
advocate against testing the new hash table implementation on our standard
optimal planning configurations? Since it's a rather performance-critical
change, I'd be happier to have it evaluated on more than just blind search
before merging. (But perhaps this discussion should be in issue694.)

> Regarding the second item: I think it would be good to defer experiments until
> we're happy with optimal configurations.

Fine with me. I don't think we need to test the satisficing configurations for
issue692, but I think we should for issue693 (some domains have massively larger
states, so hash function performance could be interesting) and for issue694 (for
similar reasons). But of course we can focus on optimal configurations for these
issues first.

> We could convert the lazy open list 
> operator pointers in issue688. Or do you prefer a separate issue for that?

Separate issues are usually easier to review and hence faster to be merged. This
is especially true when we expect a performance impact from some of the changes
and want to know where it comes from. So I think in this case a separate issue
would be better.
msg5886 (view) Author: jendrik Date: 2016-12-14.18:20:58
Regarding the first item: bjolp uses uniform cost partitioning, so CPLEX is not 
involved. Probably you meant seq instead of bjolp. I suggest to rerun the 
experiment for seq once we've merged the new hash table implementation. 

Regarding the second item: I think it would be good to defer experiments until 
we're happy with optimal configurations. We could convert the lazy open list 
operator pointers in issue688. Or do you prefer a separate issue for that?
msg5885 (view) Author: malte Date: 2016-12-14.18:07:36
Great! For this issue, I can think of two more TODO items:

- Rerun the BJOLP comparison with identical OSI/CPLEX versions. (Ideally we
should use the most recent versions for both, but IIRC, this doesn't work
because the newest CPLEX version has 32-bit issues. But perhaps we can still use
our recommended OSI version for both and couple it with an older CPLEX version
that still supports both architectures. Florian can hopefully advise on this.)

- Run experiments for satisficing configurations. If we do this later when we're
happy with the changes affecting optimal planners, we'll have to run fewer
experiments overall. If we do it earlier, we might be quicker to notice other
parts of the code we need to change before moving to 64-bit compiles. Both
options are fine for me.

Right now I can only think of one obvious change we will want to make for the
satisficing search configurations: replace the generating operator pointer in
lazy open lists with an operator ID, similar to the change for SearchNodeInfo.
We might then also want to change the interface so that it works with these IDs
and we don't have to convert back and forth between operator pointers and IDs
when we actually use IDs on both ends.
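The shape of that refactoring can be sketched as follows. All names here are invented for illustration (this is not Fast Downward's actual interface): the point is that open-list entries carry a small integer operator ID, and the global operator list is indexed by that ID only at the point of use, so no generating-operator pointer is ever stored or converted back and forth.

```python
operators = ["pick-up", "stack", "unstack"]   # stand-in for the global operator list

def get_operator(op_id):
    # IDs are used "on both ends": stored in the open list and resolved
    # only when the operator itself is needed.
    return operators[op_id]

open_list = []                  # entries: (parent state ID, operator ID)
open_list.append((0, 1))

parent, op_id = open_list.pop()
print(get_operator(op_id))      # stack
```

On a 64-bit build this replaces an 8-byte pointer per entry with a 4-byte ID, mirroring the SearchNodeInfo change mentioned above.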
msg5882 (view) Author: jendrik Date: 2016-12-14.15:47:41
Yes, the experiment only compares the old hash function and hash table to the new 
hash function and hash table. On the instance I tested locally, the old hash 
function basically produced ascending hash values, which is very bad for 
hopscotch hashing. 
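To illustrate the problem (this is not the planner's actual hash function): ascending hash values place consecutive keys in consecutive buckets, so inserts pile into the same neighbourhood, which is exactly what hopscotch hashing handles worst. A simple multiplicative mixing step, here the standard 64-bit Fibonacci-hashing multiplier, scatters such runs.

```python
MASK = (1 << 64) - 1

def mix(h):
    # Fibonacci hashing: multiply by floor(2^64 / phi), keep the high bits.
    return ((h * 0x9E3779B97F4A7C15) & MASK) >> 32

buckets = 16
ascending = [h % buckets for h in range(8)]        # raw ascending hash values
mixed = [mix(h) % buckets for h in range(8)]

print(ascending)   # 0..7 in order: one long run in a single neighbourhood
print(mixed)       # scattered across the table
```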

I have split off the issues as you suggested:

A) issue692: operator IDs instead of generating operator pointers
B) issue693: new hash function
C) issue694: new hash table implementation
D) issue695: space improvement for LM-Count
msg5873 (view) Author: malte Date: 2016-12-14.03:55:26
Those numbers certainly look better. :-) There still seems to be a bit of a
slowdown in most tasks, but I think that would be an acceptable penalty given
the memory improvements.

I still think we should split this into separate issues, though. For example, if
we change the hash function and hash table at the same time (which I think the
experiment does?), the changes become harder to evaluate.

> In light of these results, I suggest we polish and merge this hash table
> implementation, before we try to come up with fancier versions :-) What do
> you think?

Sounds fine to me. (Regarding alternative hash table implementations, I was
really hoping that we could use something simpler rather than something more
complex. But either way, this can wait.)
msg5872 (view) Author: jendrik Date: 2016-12-14.01:33:41
I analyzed and tweaked the code a bit and was able to speed it up while preserving the same memory 
usage. The most important change was to not sort the old buckets during a resize. Previously, I 
was trying to be smart about this to achieve a better layout, but the extra effort didn't pay off. 
In fact, experiments showed that the order in which buckets are reinserted, doesn't affect memory 
usage or runtime.

The resulting code version is tagged revision v5. Here are the results comparing v5 to v1, the 
version using std::unordered_map:

http://ai.cs.unibas.ch/_tmp_files/seipp/issue213-issue213-v1-vs-issue213-v5-release32.html
http://ai.cs.unibas.ch/_tmp_files/seipp/issue213-issue213-v1-vs-issue213-v5-release64.html

-> Memory usage and runtime decrease for both 32- and 64-bit mode. The effect is more pronounced 
for memory. Coverage goes up by 7 and 18 tasks.

http://ai.cs.unibas.ch/_tmp_files/seipp/issue213-release32-vs-release64-issue213-v1.html

-> With std::unordered_set, m64 solves 11 fewer tasks than m32. As we know, this is due to m64 
using much more memory than m32 in v1.

http://ai.cs.unibas.ch/_tmp_files/seipp/issue213-release32-vs-release64-issue213-v5.html

-> m64 still uses more memory than m32, but this can't be due to the hash set since it should use 
the same amount of memory in both settings. The runtimes are about the same on average. Both build 
types solve the same number of tasks.

In light of these results, I suggest we polish and merge this hash table implementation, before we 
try to come up with fancier versions :-) What do you think?
msg5871 (view) Author: malte Date: 2016-12-13.17:17:10
Works for me, but the grid will be crowded then, and in an ideal world it would
be great to have some of the aspects of this already merged before then.
msg5870 (view) Author: jendrik Date: 2016-12-13.16:53:43
I think it's best to discuss in person during the sprint how we want to move 
forward with this.
msg5868 (view) Author: malte Date: 2016-12-12.23:28:01
Yet another point: it looks like so far we have only experimented with optimal
planning configurations? Before merging, we also need data for satisficing planning.
msg5867 (view) Author: malte Date: 2016-12-12.23:27:37
Taking the wider view: I am somewhat reluctant to rush merging the new hash
implementation because of its time overhead. But I don't think this should keep
us from merging other, less critical aspects of this that affect other parts of
the code.

Suggestion: we split off the different code changes (e.g. replacing generating
operator pointers by operator IDs, new hash function, new hash table
implementation, space improvement for lmcount heuristic) into their own issues,
and we use this issue for keeping track of the relative performance of m32 vs.
m64 and the eventual switch (hopefully) to m64 in the build configs etc.

I see the following largely independent changes:

A) operator IDs instead of generating operator pointers
B) new hash function
C) new hash table implementation
D) space improvement for LM-Count

If we split these off, hopefully we can at least merge A, B and D soon.
(Of course, D still needs to be done.)

What do you think?
msg5866 (view) Author: malte Date: 2016-12-12.23:21:13
Thanks!

I don't think tweaking the parameter is the best solution at the moment. Unless
there is a performance issue in the implementation that we still have to
identify, perhaps this hash table implementation just is a bit slower than the
standard library one. It stands to reason that the standard library
implementation is optimized for something. Perhaps it is optimized for speed. ;-)

There are a few alternatives we can try. We wanted to avoid separate chaining
because of its space requirements, but we can also try bringing the space
requirements of separate chaining down for our particular use case by
compressing the information. Currently, the values -2 to -2^31 are unused, and
we could exploit this to use a single 4-byte cell to store either a state ID or
(the equivalent of) a pointer to a chain of collided values.

Assuming a load factor of 1.0, which is not too unreasonable when using separate
chaining, a "basic" but space-efficient implementation of separate chaining
takes 12 bytes per entry if there is no dynamic allocation overhead (which I
think is reasonable for a hash table that does not support deletion): 4 bytes
per bucket plus 8 bytes per linked list entry that holds the data.

I think that with some compression tricks, we can bring this down to less than 7
bytes per entry, i.e., the same amount of memory as with closed hashing at a
load factor of 0.57. (Closed hashing of course needs load factors to be lower
than what we can use with open hashing.)

This assumes that we don't need to store the hash keys -- if we do need to store
them, the comparison becomes more favourable for closed hashing because the
relative overhead of the "pointers" becomes smaller. I'm attaching a Python
script that allows estimating the size requirements of different hash table
implementations.
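
The compressed-cell idea above can be sketched as follows. This is an illustrative toy only (the name CompressedChainTable and its interface are made up; a real version would also handle resizing and deletion policy), but it shows the encoding trick: since StateIDs are nonnegative and -1 marks an empty bucket, the remaining negative values are free to encode indices into a shared overflow array, giving 4 bytes per bucket plus 8 bytes per collided entry with no per-node allocation overhead.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

const int32_t EMPTY = -1;

struct CompressedChainTable {
    struct OverflowEntry {
        int32_t state_id;
        int32_t next;  // a StateID (chain end) or another encoded index
    };
    std::vector<int32_t> buckets;      // 4 bytes per bucket
    std::vector<OverflowEntry> overflow;  // 8 bytes per collided entry

    explicit CompressedChainTable(std::size_t num_buckets)
        : buckets(num_buckets, EMPTY) {}

    // Map overflow indices 0, 1, 2, ... to cell values -2, -3, -4, ...
    static int32_t encode(std::size_t index) {
        return -2 - static_cast<int32_t>(index);
    }
    static std::size_t decode(int32_t cell) {
        return static_cast<std::size_t>(-2 - cell);
    }

    void insert(int32_t state_id, std::size_t bucket) {
        int32_t &cell = buckets[bucket];
        if (cell == EMPTY) {
            cell = state_id;  // common case: no chain needed
        } else {
            // Prepend: the new entry points at the old cell contents.
            overflow.push_back({state_id, cell});
            cell = encode(overflow.size() - 1);
        }
    }

    bool contains(int32_t state_id, std::size_t bucket) const {
        int32_t cell = buckets[bucket];
        while (cell != EMPTY) {
            if (cell >= 0)
                return cell == state_id;  // chain ends in a plain StateID
            const OverflowEntry &entry = overflow[decode(cell)];
            if (entry.state_id == state_id)
                return true;
            cell = entry.next;
        }
        return false;
    }
};
```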
msg5865 (view) Author: jendrik Date: 2016-12-12.16:29:02
I profiled blind search on pegsol-08-strips:p24.pddl. In 32-bit mode revision v1 
uses 7.5s and v3 uses 9.6s.

v1:
3.13% of the time is spent in StateRegistry::get_successor_state().
  StateRegistry::insert_id_or_pop_state() seems to be inlined.
    3.13% of the time is spent inserting states into the closed list.

v3:
18.18% of the time is spent in StateRegistry::get_successor_state().
  15.16% of the time is spent in StateRegistry::insert_id_or_pop_state().
    10.30% of the time is spent in IntHashSet::reserve().
      3.68% of the time is spent sorting the temporary vector.

One parameter we could try changing is "max_distance". Currently each
key can be at most 32 buckets from its ideal bucket. Raising this
number would allow for higher load factors (and to a lesser extent
fewer resizes) at the cost of longer lookups.

For v3 in 64-bit mode changing this parameter for the pegsol task produced:

max_distance:     32      64     128
load factor:    0.43    0.86    0.86
peak memory:  100 MB   70 MB   70 MB  
resizes:          22      21      21
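
For orientation, here is a toy version of such a bounded-probe lookup (illustrative only, not the actual IntHashSet code; -1 marks an empty slot, which works because StateIDs are nonnegative). The bound means a lookup scans at most max_distance + 1 slots, and an insert that would exceed the bound forces a resize, which is the trade-off behind the table above.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

const int MAX_DISTANCE = 32;

// Linear probing with a capped probe distance: a key may live at most
// MAX_DISTANCE slots past its ideal bucket. Returns the slot index of
// the key, or -1 if it is not in the table.
int find_slot(const std::vector<int> &buckets, int key, std::size_t hash) {
    std::size_t ideal = hash % buckets.size();
    for (int distance = 0; distance <= MAX_DISTANCE; ++distance) {
        std::size_t i = (ideal + distance) % buckets.size();
        if (buckets[i] == key)
            return static_cast<int>(i);
        if (buckets[i] == -1)  // empty slot ends the probe sequence
            return -1;
    }
    return -1;  // key cannot be stored further away than MAX_DISTANCE
}
```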
msg5864 (view) Author: florian Date: 2016-12-12.16:19:14
We have seen huge performance differences between OSI 0.103.0 and OSI 0.107.8.
If you want to compare 32 and 64 bit results, both revisions should use the same
version.
msg5863 (view) Author: jendrik Date: 2016-12-12.14:36:05
Regarding "v3: m32-vs-m64": Yes, exactly.

The experiments use the following library versions:

32-bit: OSI 0.103.0, CPLEX 12.5.1
64-bit: OSI 0.107.8, CPLEX 12.5.1

http://www.fast-downward.org/LPBuildInstructions mentions OSI 0.107.8 and CPLEX 12.6.3.
msg5862 (view) Author: malte Date: 2016-12-12.13:14:08
So some good news and some bad news, then!

I assume the second "v1: m32-vs-m64" line should be "v3: m32-vs-m64"?

I understand the desire to make the new m64 version look good, but a base-m32
vs. v3-m64 comparison is not meaningful for the decision we have to make in this
issue. We want to make the m64 compile competitive with the m32 compile. It
doesn't really matter if the m64 compile of revision X is competitive with the
m32 compile of revision Y.

Regarding the coverage drops for bjolp and seq:

The drop for seq may be CPLEX's fault, so not sure if we can do much about it.
Still, this probably warrants a closer look. I'd be a bit surprised to see the
32-bit version of CPLEX do massively better than the 64-bit version. Are these
otherwise identical versions of CPLEX and OSI? Are these the versions we recommend?

The drop for bjolp surprised me a bit, so I had a look at how it stores landmark
data, and it uses a PerStateInformation<vector<bool>>, which is very wasteful
because we pay the vector overhead for every single state. For our state
representation, we moved from SegmentedVector to SegmentedArrayVector for this
reason, and it made a big difference. We should probably do a similar thing
here, i.e., have something like PerStateInformation optimized for same-size
arrays. (And because this is a vector<bool>, we additionally need to pack
individual bits.)
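
The suggested fix could look roughly like this (a hypothetical sketch; the class name and interface are made up, and unlike the real PerStateInformation, which grows as states are registered, this assumes a known number of states): all per-state bit arrays of a fixed width share one flat vector of 32-bit words, so the per-state vector<bool> object overhead disappears and individual bits are packed.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

class PerStateBitset {
    std::size_t words_per_state;
    std::vector<uint32_t> data;
public:
    PerStateBitset(std::size_t num_states, std::size_t bits_per_state)
        : words_per_state((bits_per_state + 31) / 32),
          data(num_states * words_per_state, 0) {}

    void set(std::size_t state, std::size_t bit) {
        data[state * words_per_state + bit / 32] |= uint32_t(1) << (bit % 32);
    }
    bool test(std::size_t state, std::size_t bit) const {
        return (data[state * words_per_state + bit / 32] >> (bit % 32)) & 1;
    }
};
```

Per state this costs only the rounded-up number of bit words (e.g. 8 bytes for 40 landmarks) instead of a heap-allocated vector<bool> with its bookkeeping.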
msg5861 (view) Author: jendrik Date: 2016-12-12.11:56:48
Sure, here we go:

m32:

base-vs-v1: pointers and ints have the same size, so memory and runtime
don't change much. Coverage drops by 1 in total.

v1-vs-v3: memory usage is significantly reduced. Runtime results are
mixed. There's a slight negative trend for bjolp, cegar and ipdb and a
stronger negative trend for blind. Overall, all configurations benefit
from the change: coverage stays the same or increases in all domains for 
all configurations. Together, the configurations solve 61 more tasks.

m64:

base-vs-v1: both memory and total_time scores go up leading to 15
additional solved tasks.

v1-vs-v3: The picture is very similar to the 32-bit v1-vs-v3
comparison. Memory usage goes down, runtime goes up. The change in
memory is much more significant than the one for runtime though. In
total, coverage increases by 122 tasks.


base: m32-vs-m64 - Coverage drops for all configurations. Total coverage diff = -142
v1:   m32-vs-m64 - Coverage drops for all configurations. Total coverage diff = -136
v1:   m32-vs-m64 - Coverage drops by 20 and 32 for bjolp and seq, respectively. It
                   drops by at most five for divpot, ipdb, lmcut and mas. It remains
                   the same for blind and cegar. Total coverage diff = -65


base-m32 vs. v3-m64: Memory usage increases significantly for most
configurations. Results for runtime are mixed. Coverage decreases by 12
and 33 for bjolp and seq, respectively. It increases for all other
configurations (+40 in total). Total coverage diff = -5.
msg5860 (view) Author: malte Date: 2016-12-11.23:42:54
That's a lot of data to go through. Can you summarize the results?
msg5859 (view) Author: jendrik Date: 2016-12-11.23:15:29
Thanks! Since I didn't expect you to look into this so soon, I went ahead and ran 
an experiment for more configurations:

http://ai.cs.unibas.ch/_tmp_files/seipp/issue213-v3-opt.tar.gz

The reports make the following comparisons: 

- compare revisions base, v1 and v3 for a fixed build type.
- compare the two build types (m32 and m64) for a fixed revision.
- compare base-m32 to v3-m64
msg5858 (view) Author: malte Date: 2016-12-10.12:03:43
I added some comments on bitbucket.
msg5856 (view) Author: malte Date: 2016-12-09.11:17:37
Sounds good, but it may take a while. Lots on my plate at the moment.
msg5855 (view) Author: jendrik Date: 2016-12-09.09:51:58
I went over the IntHashMap code again and made some adjustments that speed 
things up while using the same amount of memory. For example, each key is now 
only hashed once. I'm sure there are further speed optimizations that I haven't 
discovered.

http://ai.cs.unibas.ch/_tmp_files/seipp/issue213-v2-vs-v3-blind.html

Before I test other configurations, I think it would be good if you could have 
a look at the code, Malte.
msg5854 (view) Author: jendrik Date: 2016-12-08.17:56:16
Regarding 1) in msg5851: I completely agree, this experiment was just meant to be a quick way of assessing where 
we stand. I'll run more extensive experiments before merging.

Regarding 2) in msg5851: I have repeated the experiment for 32-bit builds:
http://ai.cs.unibas.ch/_tmp_files/seipp/issue213-v2-blind-m32-issue213-v1-issue213-v2-compare.html

Here is the complete coverage table for blind search:

version   coverage
------------------
base-m32:      617
  v1-m32:      617
  v2-m32:      624
base-m64:      602
  v1-m64:      606
  v2-m64:      624
msg5853 (view) Author: malte Date: 2016-12-08.17:14:10
Regarding Andrew's suggestion, this is indeed another optimization we discussed
and should test at some point. It's already part of the longer-term plan (see
the SearchNodeInfo row and "ideal" column in msg5843).

I generally like doing this one optimization at a time because I think that
makes it easier to evaluate the impact of the various changes on memory,
coverage and runtime, but we should certainly not forget about this optimization
opportunity.

Let me point out that for certain search algorithms (greedy BFS without
reopening), we can also get rid of the g values altogether.
msg5852 (view) Author: malte Date: 2016-12-08.17:10:25
Hi Andrew, you're not the first (or second, or third ;-)) one to be tripped up
by the new "optional summary" field. Pasting your comment as a change note:

=========

Jendrik - re the changes for v1 - is there a need to store the parent operator
at all?  A while back I stripped this out of a fork of FD, and when a goal state
was reached, derived what the operators must have been by looking at the
differences between states on the path to the goal.  This has a small
postprocessing cost, but saves memory.
msg5851 (view) Author: malte Date: 2016-12-08.15:02:34
So far, so good! It would be nice to reduce the runtime penalty, but if it turns
out that we cannot, I don't think it would be a deal-breaker. Hopefully the
overhead will be much less noticeable with non-blind search settings.

I have two issues with the experiment:

1) If I understand correctly, the "base" numbers come from a separate
experiment. For the same reason that we randomize run IDs in our experiments, we
can get systematic errors if we combine data from two separate experiments. This
is less of an issue here than usually because these erros only affect runtime,
not memory usage. But I think we should still run a "clean" (all-in-one)
experiment before merging.

2) For a complete picture, we should have 32-bit and 64-bit data for all
configurations. That is, we should have data for v1-m32 and v2-m64, too.
msg5850 (view) Author: jendrik Date: 2016-12-08.14:53:00
Two changes for reducing memory consumption have been made in the pull request so far: 

issue213-v1 uses ints instead of GlobalOperator pointers for storing the parent operator.
issue213-v2 (in addition to the changes from v1) uses IntHashSet instead of std::unordered_set for storing the closed list.

I have run an experiment comparing v1 and v2 for blind search in a 64-bit build:

http://ai.cs.unibas.ch/_tmp_files/seipp/issue213-v2-blind-m64-issue213-v1-issue213-v2-compare.html

Memory consumption is significantly reduced, but runtimes go up.

v2 recovers the loss in total coverage for blind search incurred by switching to 64-bit builds:  

version   coverage
------------------
base-m32:      617
base-m64:      602
  v1-m64:      606 
  v2-m64:      624
msg5849 (view) Author: jendrik Date: 2016-12-05.11:13:42
Concerning the CPLEX error: we decided to postpone fixing this until we find a 
reliable way to test the fix and opened issue691 for this.
msg5847 (view) Author: florian Date: 2016-12-05.10:51:20
> We could exit with EXIT_OUT_OF_MEMORY if the error message matches "CPXlpopt
> returned error 1001" (1001 stands for out of memory in the CPLEX lib). Not
> sure if that's the best solution though.

Sounds good. This is also the way we handle other error messages that come
through unexpected channels (e.g., OSI reports some CPLEX errors as warnings).
msg5846 (view) Author: jendrik Date: 2016-12-05.01:01:48
Minor remark: the task tested in msg5843 is actually elevators-opt11-strips:p04.pddl. In 
revision issue213-base it uses 102 MB in a 32-bit build and 152 MB in a 64-bit build.

I have implemented a new hash table class using open addressing instead of chaining. It 
decreases peak memory usage to 80 MB in 32-bit builds and to 82 MB in 64-bit builds. 

A critical ingredient was a new hash function. The old hash function 
(utils/hash.h:hash_sequence()) produced too many collisions, resulting in a final load 
factor of 0.05 and using ~400 MB (peak memory). Switching to Jenkins' one-at-a-time hash 
function (https://en.wikipedia.org/wiki/Jenkins_hash_function) increased the final load 
factor to 0.79.
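
The one-at-a-time hash is small enough to quote. This sketch hashes a sequence of ints directly rather than byte by byte, which is a simplification of the original byte-wise formulation:

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Jenkins' one-at-a-time hash over a sequence of ints
// (https://en.wikipedia.org/wiki/Jenkins_hash_function).
uint32_t one_at_a_time(const std::vector<int> &key) {
    uint32_t hash = 0;
    for (int value : key) {
        hash += static_cast<uint32_t>(value);
        hash += hash << 10;
        hash ^= hash >> 6;
    }
    // Final avalanching so that all input positions affect all bits.
    hash += hash << 3;
    hash ^= hash >> 11;
    hash += hash << 15;
    return hash;
}
```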

Since coming up with his one-at-a-time hash function, Jenkins has released multiple 
improved hash functions into the public domain. I will try hashword() from
http://www.burtleburtle.net/bob/c/lookup3.c tomorrow. This seems to be one of the best 
non-cryptographic hash functions for variable-length integer sequences. He has also 
released SpookyHash, but it only works for 64-bit builds.

It would be great if someone could have a look at the implementation of IntHashMap. I 
opened a pull request at https://bitbucket.org/jendrikseipp/downward/pull-requests/63 .
msg5844 (view) Author: malte Date: 2016-11-30.19:16:01
One small addition: the total numbers in the last row of Jendrik's table assume
that the number of open list entries and the number of generated states are of
the same order of magnitude. For tasks with many duplicates, it is possible that
there are arbitrarily more open list entries, so that the open lists can become
the dominating factor for memory usage.

If we pursue the goal of minimizing memory usage, it would make sense to
introduce measurements to the planner that compare the maximum number of open
list entries with the number of generated states. If it turns out that open
lists are sometimes the bottleneck, we can rethink our current (quite lazy)
duplicate elimination scheme.
msg5843 (view) Author: jendrik Date: 2016-11-30.19:10:59
Malte and I analyzed the memory profiles of A* + blind search for pegsol-opt11-strips:p03.pddl today and noted 
how much memory we use where, in a 32-bit build, a 64-bit build and an ideal version. Here are the numbers:


A* + blind search                                  32-bit          64-bit           ideal

---------------------------------------------------------------------------------------------------------------

PerStateInformation<SearchNodeInfo>              16 Bytes        24 Bytes         8 Bytes (parent state + g)
(parent state, creating operator, g, real g)

PerStateInformation<HEntry>                       4 Bytes         4 Bytes         0 Bytes (for blind search)
(h-cache)

State registry - state data                     4*B Bytes       4*B Bytes       4*B Bytes (B = #buckets)
(h-cache)

State registry - hash table                     ~17 Bytes       ~34 Bytes        ~5 Bytes (4 Bytes + overhead)
(h-cache)

Open list                                         4 Bytes         4 Bytes         4 Bytes (per open-list entry)

---------------------------------------------------------------------------------------------------------------

Total                                           ~45 Bytes       ~70 Bytes       ~21 Bytes (in example with B=1)


We have already started to reduce the size of SearchNodeInfo by storing operator IDs (ints) instead of 
GlobalOperator pointers. Next, we will be looking into more memory-efficient hash table implementations, probably 
using open addressing instead of separate chaining like std::unordered_set.
msg5842 (view) Author: jendrik Date: 2016-11-30.00:28:34
I agree. After looking at the code again, my fix makes no sense. The error 
message apparently comes from COIN, not CPLEX. Our code detects this, calls 
handle_coin_error() which prints "Coin threw exception: ...". We could exit 
with EXIT_OUT_OF_MEMORY if the error message matches "CPXlpopt returned error 
1001" (1001 stands for out of memory in the CPLEX lib). Not sure if that's the 
best solution though.
msg5841 (view) Author: florian Date: 2016-11-29.21:34:00
Hard to say. The other error messages look more like the warning you saw on
stdout. So, maybe this is communicated over a different channel (wouldn't
surprise me). Is the "Coin threw exception:" stuff from our code? It would be
better if we were sure that the error actually passes through the error handler
that you modified. If you cannot reproduce the error, I would rather leave the
handler as it is than change it and hope that the change is correct.
msg5840 (view) Author: jendrik Date: 2016-11-29.21:18:41
I failed to reproduce the error on both my machine and maia. So I'm not sure how to test my fix. I pushed it to my 
bitbucket repo:

https://bitbucket.org/jendrikseipp/downward/commits/d19cd04fa4bb1547524573669ac83170c050cd5b?at=default

Florian, do you think this fix is correct?
msg5838 (view) Author: florian Date: 2016-11-29.14:45:21
The message "Compressing row and column files." is unrelated (that is just
CPLEX's last effort to conserve memory). We would have to catch error 1001 in
our custom error handler before it throws the exception.
msg5837 (view) Author: jendrik Date: 2016-11-29.14:38:37
Concerning the critical error: 

The output on stdout is

CPX0000  Compressing row and column files.

The output on stderr is

Coin threw exception: CPXlpopt returned error 1001
 from method resolve
 from class OsiCpxSolverInterface
Unexplained error occurred.

(The last line is from Fast Downward.)

Interestingly, the file "OsiDefaultName.xxx" is written to disk, containing a 
textual representation of an LP.
msg5835 (view) Author: malte Date: 2016-11-29.14:14:50
It's a pity that the data isn't better. It's hard to justify moving to 64 bits
with such a large increase in memory usage (and loss in coverage). I think with
these numbers, we should do something about memory usage.

I suggest we focus on blind search first. There may be other memory-wasters
hidden inside certain heuristics etc., but whatever is making blind search use
much more memory will affect all configurations that expand/evaluate/generate
many states. 

Of course we should measure things before proceeding, but my best guess at the
moment is that the hash table (unordered_set) inside StateRegistry is to blame
since I cannot think of any other major data structures for which it is
plausible that they require much more memory in a 64-bit compile.

I would suggest that we
1) do a memory profile on some representative blind search test cases, in 32-bit
mode and 64-bit mode
2) look more closely at the main memory users and their underlying
implementation to see how they differ in 32-bit and 64-bit mode
3) think of more memory-efficient implementations

Perhaps the answer is simply to use another ready-made hash table
implementation, but I think it may be worthwhile to understand this more deeply
before we design a solution.
msg5834 (view) Author: florian Date: 2016-11-29.14:03:22
Looks like a huge satellite task running out of memory in an LP configuration. I
suspect CPLEX throws an error that we don't recognize as an out-of-memory
error. This happened before, and we already have special casing for some of
these errors. Jendrik, if you can figure out the exact error message from the logs,
we can add a special case for it.
msg5831 (view) Author: malte Date: 2016-11-29.13:57:52
Should we look into the critical error in that experiment?
msg5828 (view) Author: jendrik Date: 2016-11-29.11:54:22
Due to newer CPLEX versions removing support for 32-bit builds, the discussion 
about 64-bit builds came up again. I repeated Florian's experiment on the bigger 
benchmark set and with some new optimal configurations. The general picture 
remains the same though. Coverage decreases in many domains for all configs.

http://ai.cs.unibas.ch/_tmp_files/seipp/issue213-opt-comparison.html

The question is whether and how we want to act on these results. This may be 
something we should discuss in a Fast Downward meeting.
msg4128 (view) Author: jendrik Date: 2015-03-31.13:38:22
issue436 (removing GNU hash sets and maps) has been addressed now, so work on 
this issue could continue.
msg3882 (view) Author: andrew.coles Date: 2014-10-27.14:01:11
(Background:

The standard hash_set implementation is a hashtable.  The hashtable is stored as
a std::vector, where each index contains a singly-linked list: an item of data,
and a 'next' pointer to the next item in the list. )

It's also possible to represent the lists in each bucket of a hash set by moving
the next pointer into the state.  Normally this wouldn't be done -- we can only
have one hash set implemented in this way -- but the state registry is special.
 In implementation terms, this can be done by making the PackedStateBin one int
larger, and instead of storing the 'next' entry as a pointer, store it as the
StateID of the next state.  The hash buckets themselves are then StateIDs: -1 if
empty, or the StateID of the first state on the list of states in that bucket.

Storing 'next' as part of the PackedStateBin gives a 25% reduction in memory
usage on 32-bit builds - see attached state_registry_low_memory.tar.gz.  The
downside is a drop in performance: blind search is a factor of 2 to 3 slower. 
(This surprised me, and might be improvable with some profiling.)  The % time
overhead would of course be lower with something other than blind search.
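
A toy version of the lookup side of this scheme (illustrative names; the flat StateInfo struct stands in for the enlarged PackedStateBin): each state record carries the StateID of the next state in its hash bucket, so a bucket is a single StateID and no separate linked-list nodes exist at all.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct StateInfo {
    int data;  // stands in for the packed state variables
    int next;  // StateID of the next state in the same bucket, or -1
};

// Walk the intrusive chain starting at the bucket's head StateID.
// Returns the StateID of the matching state, or -1 if not registered.
int find(const std::vector<StateInfo> &states,
         const std::vector<int> &buckets, int data, std::size_t bucket) {
    for (int id = buckets[bucket]; id != -1; id = states[id].next)
        if (states[id].data == data)
            return id;
    return -1;
}
```

The memory accounting behind the 25% figure: one extra int per state replaces both the per-node 'next' pointer and the per-node allocation overhead of the standard hash set.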
msg3881 (view) Author: florian Date: 2014-10-27.11:20:12
Jendrik already added them (hash sets etc. are handled in issue436).
msg3880 (view) Author: malte Date: 2014-10-27.11:18:00
Worth adding crossreferences between the issues?
msg3879 (view) Author: florian Date: 2014-10-27.11:17:39
Ahh, interesting. This looks very similar to the way we store state data in the
state registry. Maybe we can reuse some code here. But I think we should wait
until we have changed the hash sets to the C++11 variants.
msg3878 (view) Author: andrew.coles Date: 2014-10-27.00:45:25
I've been looking into memory management recently*, and various hash set/hash
table options, and recalled there was a thread here on that topic.

You might want to try the attached replacement for the standard C++ allocator,
for use where objects will be persistent - e.g. nodes inside data structures. 
In state_registry.h switch the definition of StateIDSet to:

    typedef __gnu_cxx::hash_set<StateID,
                                StateIDSemanticHash,
                                StateIDSemanticEqual,
                                ChunkAllocator<StateID>  > StateIDSet;

A quick unscientific experiment doing this on a 32-bit build shows a 17% or so
reduction in memory usage.


* In IPC2011, POPF used 6GB of memory in 25s, so this is long overdue
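
The principle behind such an allocator, in heavily simplified form (this is not the attached ChunkAllocator, which implements the full C++ allocator interface): grab memory in large chunks and hand it out cell by cell, so each node allocation avoids individual malloc/new bookkeeping. Individual cells are never freed, which is fine for persistent nodes in grow-only data structures; it also assumes cell_size is a multiple of the required alignment.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

class ChunkPool {
    static const std::size_t CHUNK_CELLS = 1024;
    std::size_t cell_size;
    std::vector<std::vector<char>> chunks;
    std::size_t used_in_last_chunk;
public:
    explicit ChunkPool(std::size_t cell_size_)
        : cell_size(cell_size_), used_in_last_chunk(CHUNK_CELLS) {}

    void *allocate() {
        if (used_in_last_chunk == CHUNK_CELLS) {
            // Current chunk exhausted: grab a fresh one in a single
            // allocation that will serve the next CHUNK_CELLS requests.
            chunks.push_back(std::vector<char>(CHUNK_CELLS * cell_size));
            used_in_last_chunk = 0;
        }
        return &chunks.back()[used_in_last_chunk++ * cell_size];
    }
};
```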
msg3730 (view) Author: jendrik Date: 2014-10-08.21:30:42
I added a note in issue436.
msg3727 (view) Author: florian Date: 2014-10-08.20:37:02
I ran some memory profiles a while back and if I remember correctly, the hash
set entries contributed a large part to the total memory usage, so this seems
plausible. I vote to leave this issue open, add a reference to it on the issue
for hash sets (does it exist yet?) and do a memory profile once we finished
working on the hash sets.
msg3714 (view) Author: malte Date: 2014-10-08.12:10:23
Interesting and mildly surprising. The only reasonable explanation I can come up with
for something like astar(blind()) is that this is due to the central hash_set in
the state registry. We might want to leave this one open and look at it again
when we revisit the hash tables, which is something we had on the screen anyway.
Or we might just close it as something that is the way it is. Preferences?
msg3702 (view) Author: florian Date: 2014-10-08.09:04:47
Seems like this is still an issue with the new state representation:

http://ai.cs.unibas.ch/_tmp_files/pommeren/issue213-base-issue213-base64-compare.html
msg3669 (view) Author: malte Date: 2014-10-06.12:20:19
I suggest blind(), BJOLP, lmcut(), ipdb() and our two common merge-and-shrink
configurations, on the optimal (incl. ipc11) benchmark suite.
msg3668 (view) Author: florian Date: 2014-10-06.12:16:55
> Florian, how much work would it be to investigate the time and memory of -m32
> vs. -m64 for some standard configurations?

Not too much. I could create two experimental branches, modify the Makefile in
them and set up an experiment for this. Which configs/tasks do you want to test?
msg3636 (view) Author: malte Date: 2014-10-04.19:42:03
I'm adding Florian to this issue as our "master state memory manager". I wonder
if this is still an issue with our new way of storing state information. Perhaps
we can get rid of "-m32"?

Florian, how much work would it be to investigate the time and memory of -m32
vs. -m64 for some standard configurations?
msg1608 (view) Author: malte Date: 2011-08-13.20:17:45
I made some more tests, which show that the behaviour already shows up with
blind search. Here's the relevant data; everything else (no. expanded states,
no. hash buckets in the closed list etc.) was reported identically with -m32 and
-m64.

For reference, this is from my new desktop machine (Xeon E31270 running 64-bit
Ubuntu 10.10 with gcc 4.4.4-14ubuntu5).

./downward-1 --search 'astar(blind())' < output:

woodworking-opt #12:

-m32:
Total time: 20.28s
Peak memory: 345372 KB

-m64:
Total time: 18.25s
Peak memory: 501048 KB

woodworking-opt #23:

-m32:
Total time: 11.67s
Peak memory: 214460 KB

-m64:
Total time: 10.69s
Peak memory: 309812 KB

blocks #9-0:

-m32:
Total time: 24.07s
Peak memory: 497516 KB

-m64:
Total time: 22.61s
Peak memory: 866388 KB
msg1607 (view) Author: malte Date: 2011-08-13.19:12:29
For future reference, here's the relevant excerpt of the original email that
discussed -m32 vs. -m64 memory usage:

===========================================================================
as a sanity test for our packaging, I did some experiments with the
final seq-opt-bjolp package this night, and noticed that it lost 17
(IIRC) problems compared to my previous tests, which is rather a lot.

Looking into this a bit more deeply, this seems to be due to -m64 using
a lot more memory than -m32. My tests were run with 2 GB (so this would
not be such an issue with the 6 GB in the competition), but still the
memory usage difference is rather large, and maybe we should do
something about this in the long run. I don't know if this is typical
behaviour or specific to LM configurations.

Example for the four toughest solved tasks in woodworking:

 #04: uses  737 MB with -m32, 1192 MB with -m64
 #14: uses 1277 MB with -m32, runs out of memory with -m64
 #15: uses 1583 MB with -m32, runs out of memory with -m64
 #24: uses  322 MB with -m32,  514 MB with -m64

I think there's nothing we could or should do about this for the
competition, but maybe it's worth opening an issue on this for later.
===========================================================================
msg1220 (view) Author: malte Date: 2011-01-23.11:29:53
It's true that the LM graph storage is not particularly efficient, but that
cannot explain memory usage in the gigabytes. In fact, I'd be surprised if the
landmark graph contributed even a single megabyte here. We should really
optimize where it matters first, and that means reducing *per-state* memory
cost, not per-problem memory cost. That means we can safely ignore 1) and 2),
although 3) is indeed an issue.

Also, going from vector<int> to vector<state_t> etc. on vectors that only
usually have one or two elements as in point 1) saves next to nothing (or even
exactly nothing due to memory alignment issues); the vector and dynamic-memory
management overhead dwarfs everything else here. Feel free to try the
optimizations in 1) and 2) out, since seeing is believing :-).

Note that so far we have no evidence that this memory explosion only happens for
the landmark configurations; they might happen everywhere. Landmark
configurations were the only ones I had the chance to look at so far.
msg1219 (view) Author: erez Date: 2011-01-23.11:11:44
The landmarks code is not very efficient with memory, and this is compounded in 
64-bit mode. Examples with BJOLP configuration that Malte ran:

Woodworking #04: uses  737 MB with -m32, 1192 MB with -m64
Woodworking #14: uses 1277 MB with -m32, runs out of memory with -m64
Woodworking #15: uses 1583 MB with -m32, runs out of memory with -m64
Woodworking #24: uses  322 MB with -m32,  514 MB with -m64

Here are some places where the landmarks code uses lots of memory:

1. LandmarkNode has:

   vector<int> vars
   vector<int> vals
   set<int> first_achievers
   set<int> possible_achievers
   hash_set<pair<int, int>, hash_int_pair> forward_orders

Should be - 

   vector<var_num_t>  vars
   vector<state_var_t> vals
   set<operator_num_t> first_achievers
   set<operator_num_t> possible_achievers
   hash_set<pair<var_num_t, state_var_t>, hash_int_pair> forward_orders

If var_num_t, operator_num_t and state_var_t are all 32 bits, that would reduce 
the memory footprint of the landmarks by almost a half (there are some other 
members).

2. LandmarksGraph has:
    vector<int> empty_pre_operators
    vector<vector<vector<int> > > operators_eff_lookup
    vector<vector<vector<int> > > operators_pre_lookup
    vector<vector<set<pair<int, int> > > > inconsistent_facts
    hash_map<pair<int, int>, LandmarkNode *, hash_int_pair> simple_lms_to_nodes
    hash_map<pair<int, int>, LandmarkNode *, hash_int_pair> disj_lms_to_nodes
    hash_map<pair<int, int>, Pddl_proposition, hash_int_pair> pddl_propositions

should be - 

    vector<operator_num_t> empty_pre_operators
    vector<vector<vector<operator_num_t> > > operators_eff_lookup
    vector<vector<vector<operator_num_t> > > operators_pre_lookup
    vector<vector<set<pair<num_vars_t, state_var_t> > > > inconsistent_facts
    hash_map<pair<num_vars_t, state_var_t>, LandmarkNode *, hash_int_pair> 
simple_lms_to_nodes
    hash_map<pair<num_vars_t, state_var_t>, LandmarkNode *, hash_int_pair> 
disj_lms_to_nodes
    hash_map<pair<num_vars_t, state_var_t>, Pddl_proposition, hash_int_pair> 
pddl_propositions

3. LandmarkStatusManager uses StateProxy. If we add an ID for each state, we can 
replace this with a number, which might use less memory.
History
Date User Action Args
2019-01-21 13:58:32maltesetmessages: + msg8494
2019-01-21 13:56:24jendriksetstatus: chatting -> resolved
messages: + msg8493
summary: TODOs before we can close this issue: - Use operator IDs instead of generating operator pointers (issue692) - Implement new hash function (issue693) - Implement new hash table (issue694) - Store landmarks more efficiently (issue695) - Test that most important configurations perform as well in 64-bit builds as in 32-bit builds with 5 and 30 minute timeouts (this issue): done - Make relaxation heuristics more efficient in 64-bit mode (issue814) - Remove option for 32-bit builds from the build system (issue754) - Run 5 minute experiments in debug mode: done - Update wiki docs: done TODOs for later: - Reduce memory usage of causal graph heuristic cache (issue838) - Improve performance of cea (use fewer pointers?) - Improve performance of cg? (perhaps performance is already OK, but similar ideas as for h^add and/or h^cea should be directly applicable) - Improve performance of LM-Cut (similar ideas as for h^add etc. could be applied) -> TODOs before we can close this issue: - Use operator IDs instead of generating operator pointers (issue692) - Implement new hash function (issue693) - Implement new hash table (issue694) - Store landmarks more efficiently (issue695) - Test that most important configurations perform as well in 64-bit builds as in 32-bit builds with 5 and 30 minute timeouts (this issue): done - Make relaxation heuristics more efficient in 64-bit mode (issue814) - Remove option for 32-bit builds from the build system (issue754) - Run 5 minute experiments in debug mode: done - Update wiki docs: done
2019-01-21 12:40:05 jendrik set messages: + msg8488
2019-01-21 12:33:27 malte set status: resolved -> chatting
messages: + msg8486
2019-01-18 17:10:31 jendrik set status: in-progress -> resolved
messages: + msg8473
summary: TODOs before we can close this issue: - Use operator IDs instead of generating operator pointers (issue692) - Implement new hash function (issue693) - Implement new hash table (issue694) (blocked by issue693) - Store landmarks more efficiently (issue695) - Test that most important configurations perform as well in 64-bit builds as in 32-bit builds with 5 and 30 minute timeouts (this issue) - Make relaxation heuristics more efficient in 64-bit mode (issue814) - Remove option for 32-bit builds from the build system (issue754) - Run 5 minute experiments in debug mode TODOs for later: - Reduce memory usage of causal graph heuristic cache (issue838) - Improve performance of cea (use fewer pointers?) - Improve performance of cg? (perhaps performance is already OK, but similar ideas as for h^add and/or h^cea should be directly applicable) - Improve performance of LM-Cut (similar ideas as for h^add etc. could be applied) -> TODOs before we can close this issue: - Use operator IDs instead of generating operator pointers (issue692) - Implement new hash function (issue693) - Implement new hash table (issue694) - Store landmarks more efficiently (issue695) - Test that most important configurations perform as well in 64-bit builds as in 32-bit builds with 5 and 30 minute timeouts (this issue): done - Make relaxation heuristics more efficient in 64-bit mode (issue814) - Remove option for 32-bit builds from the build system (issue754) - Run 5 minute experiments in debug mode: done - Update wiki docs: done TODOs for later: - Reduce memory usage of causal graph heuristic cache (issue838) - Improve performance of cea (use fewer pointers?) - Improve performance of cg? (perhaps performance is already OK, but similar ideas as for h^add and/or h^cea should be directly applicable) - Improve performance of LM-Cut (similar ideas as for h^add etc. could be applied)
2018-12-15 17:33:39 malte set messages: + msg8364
2018-12-13 23:04:46 jendrik set messages: + msg8310
2018-12-13 20:56:17 jendrik set messages: + msg8307
2018-12-13 20:27:32 malte set messages: + msg8306
2018-12-13 20:24:37 jendrik set messages: + msg8304
2018-12-13 19:40:02 malte set messages: + msg8302
summary: TODOs before we can close this issue: - Use operator IDs instead of generating operator pointers (issue692) - Implement new hash function (issue693) - Implement new hash table (issue694) (blocked by issue693) - Store landmarks more efficiently (issue695) - Test that most important configurations perform as well in 64-bit builds as in 32-bit builds with 5 and 30 minute timeouts (this issue) - Make relaxation heuristics more efficient in 64-bit mode (issue814) - Remove option for 32-bit builds from the build system (issue754) - Run 5 minute experiments in debug mode TODOs for later: - Reduce memory usage of causal graph heuristic cache (issue838) - Improve performance of cea (use fewer pointers?) -> TODOs before we can close this issue: - Use operator IDs instead of generating operator pointers (issue692) - Implement new hash function (issue693) - Implement new hash table (issue694) (blocked by issue693) - Store landmarks more efficiently (issue695) - Test that most important configurations perform as well in 64-bit builds as in 32-bit builds with 5 and 30 minute timeouts (this issue) - Make relaxation heuristics more efficient in 64-bit mode (issue814) - Remove option for 32-bit builds from the build system (issue754) - Run 5 minute experiments in debug mode TODOs for later: - Reduce memory usage of causal graph heuristic cache (issue838) - Improve performance of cea (use fewer pointers?) - Improve performance of cg? (perhaps performance is already OK, but similar ideas as for h^add and/or h^cea should be directly applicable) - Improve performance of LM-Cut (similar ideas as for h^add etc. could be applied)
2018-12-13 19:38:33 malte set messages: + msg8301
2018-12-13 13:19:17 florian set messages: + msg8285
2018-12-13 12:57:39 jendrik set messages: + msg8284
2018-12-13 12:38:22 jendrik set messages: + msg8283
summary: TODOs before we can close this issue: - Use operator IDs instead of generating operator pointers (issue692) - Implement new hash function (issue693) - Implement new hash table (issue694) (blocked by issue693) - Store landmarks more efficiently (issue695) - Test that most important configurations perform as well in 64-bit builds as in 32-bit builds with 5 and 30 minute timeouts (this issue) - Make relaxation heuristics more efficient in 64-bit mode (issue814) - Remove option for 32-bit builds from the build system (issue754) - Run 5 minute experiments in debug mode - Investigate drop in performance of the operator-counting heuristics (msg7584) TODOs for later: - Reduce memory usage of causal graph heuristic cache (issue838) - Improve performance of cea (use fewer pointers?) -> TODOs before we can close this issue: - Use operator IDs instead of generating operator pointers (issue692) - Implement new hash function (issue693) - Implement new hash table (issue694) (blocked by issue693) - Store landmarks more efficiently (issue695) - Test that most important configurations perform as well in 64-bit builds as in 32-bit builds with 5 and 30 minute timeouts (this issue) - Make relaxation heuristics more efficient in 64-bit mode (issue814) - Remove option for 32-bit builds from the build system (issue754) - Run 5 minute experiments in debug mode TODOs for later: - Reduce memory usage of causal graph heuristic cache (issue838) - Improve performance of cea (use fewer pointers?)
2018-12-06 15:38:50 jendrik set messages: + msg8166
summary: Subissues: - Use operator IDs instead of generating operator pointers (issue692) - Implement new hash function (issue693) - Implement new hash table (issue694) (blocked by issue693) - Store landmarks more efficiently (issue695) - Test that most important configurations perform as well in 64-bit builds as in 32-bit builds (this issue) - Make lazy greedy search more efficient in 64-bit mode (issue814) - Remove option for 32-bit builds from the build system (issue754) - Before closing this, run 5 minute experiments in debug mode - Investigate drop in performance of the operator-counting heuristics (msg7584) - Reduce memory usage of causal graph heuristic cache (issue838) - Improve performance of cea (use fewer pointers?) -> TODOs before we can close this issue: - Use operator IDs instead of generating operator pointers (issue692) - Implement new hash function (issue693) - Implement new hash table (issue694) (blocked by issue693) - Store landmarks more efficiently (issue695) - Test that most important configurations perform as well in 64-bit builds as in 32-bit builds with 5 and 30 minute timeouts (this issue) - Make relaxation heuristics more efficient in 64-bit mode (issue814) - Remove option for 32-bit builds from the build system (issue754) - Run 5 minute experiments in debug mode - Investigate drop in performance of the operator-counting heuristics (msg7584) TODOs for later: - Reduce memory usage of causal graph heuristic cache (issue838) - Improve performance of cea (use fewer pointers?)
2018-12-06 15:28:55 jendrik set nosy: + cedric
messages: + msg8165
summary: Subissues: - Use operator IDs instead of generating operator pointers (issue692) - Implement new hash function (issue693) - Implement new hash table (issue694) (blocked by issue693) - Store landmarks more efficiently (issue695) - Test that most important configurations perform as well in 64-bit builds as in 32-bit builds (this issue) - Make lazy greedy search more efficient in 64-bit mode (issue814) - Remove option for 32-bit builds from the build system (issue754) - Before closing this, run 5 minute experiments in debug mode - Investigate drop in performance of the operator-counting heuristics (msg7584) - Reduce memory usage of causal graph heuristic cache (issue838) -> Subissues: - Use operator IDs instead of generating operator pointers (issue692) - Implement new hash function (issue693) - Implement new hash table (issue694) (blocked by issue693) - Store landmarks more efficiently (issue695) - Test that most important configurations perform as well in 64-bit builds as in 32-bit builds (this issue) - Make lazy greedy search more efficient in 64-bit mode (issue814) - Remove option for 32-bit builds from the build system (issue754) - Before closing this, run 5 minute experiments in debug mode - Investigate drop in performance of the operator-counting heuristics (msg7584) - Reduce memory usage of causal graph heuristic cache (issue838) - Improve performance of cea (use fewer pointers?)
2018-09-20 15:57:22 malte set messages: + msg7650
2018-09-20 15:21:02 jendrik set messages: + msg7647
summary: Subissues: - Use operator IDs instead of generating operator pointers (issue692) - Implement new hash function (issue693) - Implement new hash table (issue694) (blocked by issue693) - Store landmarks more efficiently (issue695) - Test that most important configurations perform as well in 64-bit builds as in 32-bit builds (this issue) - Make lazy greedy search more efficient in 64-bit mode (issue814) - Remove option for 32-bit builds from the build system (issue754) - Before closing this, run 5 minute experiments in debug mode. - Investigate drop in performance of the operator-counting heuristics (msg7584) -> Subissues: - Use operator IDs instead of generating operator pointers (issue692) - Implement new hash function (issue693) - Implement new hash table (issue694) (blocked by issue693) - Store landmarks more efficiently (issue695) - Test that most important configurations perform as well in 64-bit builds as in 32-bit builds (this issue) - Make lazy greedy search more efficient in 64-bit mode (issue814) - Remove option for 32-bit builds from the build system (issue754) - Before closing this, run 5 minute experiments in debug mode - Investigate drop in performance of the operator-counting heuristics (msg7584) - Reduce memory usage of causal graph heuristic cache (issue838)
2018-09-20 14:51:42 jendrik set messages: + msg7645
2018-09-20 14:36:47 jendrik set messages: + msg7642
2018-09-20 13:18:15 jendrik set messages: + msg7635
2018-09-20 12:42:15 malte set messages: + msg7632
2018-09-20 12:08:45 jendrik set messages: + msg7630
summary: Subissues: - Use operator IDs instead of generating operator pointers (issue692) - Implement new hash function (issue693) - Implement new hash table (issue694) (blocked by issue693) - Store landmarks more efficiently (issue695) - Test that most important configurations perform as well in 64-bit builds as in 32-bit builds (this issue) - Make lazy greedy search more efficient in 64-bit mode (issue814) - Remove option for 32-bit builds from the build system (issue754) - Before closing this, run 5 minute experiments in debug mode. -> Subissues: - Use operator IDs instead of generating operator pointers (issue692) - Implement new hash function (issue693) - Implement new hash table (issue694) (blocked by issue693) - Store landmarks more efficiently (issue695) - Test that most important configurations perform as well in 64-bit builds as in 32-bit builds (this issue) - Make lazy greedy search more efficient in 64-bit mode (issue814) - Remove option for 32-bit builds from the build system (issue754) - Before closing this, run 5 minute experiments in debug mode. - Investigate drop in performance of the operator-counting heuristics (msg7584)
2018-09-20 12:06:36 jendrik set messages: + msg7629
2018-09-19 16:17:16 florian set messages: + msg7584
2018-09-19 12:28:35 jendrik set messages: + msg7561
2018-09-18 14:19:20 malte set messages: + msg7514
2018-09-18 13:59:26 jendrik set messages: + msg7512
2018-09-17 15:40:47 jendrik set messages: + msg7491
summary: Subissues: - Use operator IDs instead of generating operator pointers (issue692) - Implement new hash function (issue693) - Implement new hash table (issue694) (blocked by issue693) - Store landmarks more efficiently (issue695) - Test that most important configurations perform as well in 64-bit builds as in 32-bit builds (this issue) - Make lazy greedy search more efficient in 64-bit mode (issue814) - Remove option for 32-bit builds from the build system (issue754) -> Subissues: - Use operator IDs instead of generating operator pointers (issue692) - Implement new hash function (issue693) - Implement new hash table (issue694) (blocked by issue693) - Store landmarks more efficiently (issue695) - Test that most important configurations perform as well in 64-bit builds as in 32-bit builds (this issue) - Make lazy greedy search more efficient in 64-bit mode (issue814) - Remove option for 32-bit builds from the build system (issue754) - Before closing this, run 5 minute experiments in debug mode.
2018-09-17 08:45:11 silvan set nosy: + silvan
2018-09-15 15:32:31 malte set messages: + msg7470
2018-09-15 09:42:08 jendrik set messages: + msg7469
2018-09-14 11:17:39 malte set messages: + msg7455
2018-09-14 11:01:48 jendrik set messages: + msg7454
2018-09-14 10:26:50 malte set messages: + msg7453
2018-09-14 09:17:58 jendrik set messages: + msg7449
2018-09-13 14:14:56 jendrik set messages: + msg7432
2018-09-13 12:30:49 malte set messages: + msg7430
2018-09-13 12:14:06 jendrik set messages: + msg7427
2018-09-13 11:58:32 malte set messages: + msg7425
title: inefficient memory use with -m64 setting -> Remove 32-bit and 64-bit build options.
2018-09-13 11:45:15 jendrik set status: chatting -> in-progress
messages: + msg7422
summary: Subissues: - Use operator IDs instead of generating operator pointers (issue692) - Implement new hash function (issue693) - Implement new hash table (issue694) (blocked by issue693) - Store landmarks more efficiently (issue695) - Test that most important configurations perform as well in 64-bit builds as in 32-bit builds (this issue) - Make lazy greedy search more efficient in 64-bit mode (issue814) - Remove option for 32-bit builds from the build system (issue754) -> Subissues: - Use operator IDs instead of generating operator pointers (issue692) - Implement new hash function (issue693) - Implement new hash table (issue694) (blocked by issue693) - Store landmarks more efficiently (issue695) - Test that most important configurations perform as well in 64-bit builds as in 32-bit builds (this issue) - Make lazy greedy search more efficient in 64-bit mode (issue814) - Remove option for 32-bit builds from the build system (issue754)
2018-09-12 14:36:34 malte set messages: + msg7399
summary: Subissues: - Use operator IDs instead of generating operator pointers (issue692) - Implement new hash function (issue693) - Implement new hash table (issue694) (blocked by issue693) - Store landmarks more efficiently (issue695) - Test that most important configurations perform as well in 64-bit builds as in 32-bit builds (this issue) - Make lazy greedy search more memory-efficient in 64-bit mode (issue814) - Remove option for 32-bit builds from the build system (issue754) -> Subissues: - Use operator IDs instead of generating operator pointers (issue692) - Implement new hash function (issue693) - Implement new hash table (issue694) (blocked by issue693) - Store landmarks more efficiently (issue695) - Test that most important configurations perform as well in 64-bit builds as in 32-bit builds (this issue) - Make lazy greedy search more efficient in 64-bit mode (issue814) - Remove option for 32-bit builds from the build system (issue754)
2018-09-12 14:06:38 jendrik set messages: + msg7397
2018-09-12 13:47:00 jendrik set messages: + msg7394
2018-09-12 11:19:37 malte set messages: + msg7387
summary: Subissues: - Use operator IDs instead of generating operator pointers (issue692) - Implement new hash function (issue693) - Implement new hash table (issue694) (blocked by issue693) - Store landmarks more efficiently (issue695) - Test that most important configurations perform as well in 64-bit builds as in 32-bit builds (this issue) - Remove option for 32-bit builds from the build system (issue754) -> Subissues: - Use operator IDs instead of generating operator pointers (issue692) - Implement new hash function (issue693) - Implement new hash table (issue694) (blocked by issue693) - Store landmarks more efficiently (issue695) - Test that most important configurations perform as well in 64-bit builds as in 32-bit builds (this issue) - Make lazy greedy search more memory-efficient in 64-bit mode (issue814) - Remove option for 32-bit builds from the build system (issue754)
2018-09-12 11:14:28 malte set messages: + msg7385
2018-09-12 09:30:12 jendrik set messages: + msg7383
2018-09-11 12:13:52 jendrik set messages: + msg7365
summary: Subissues: - Use operator IDs instead of generating operator pointers (issue692) - Implement new hash function (issue693) - Implement new hash table (issue694) (blocked by issue693) - Store landmarks more efficiently (issue695) - Test that most important configurations perform as well in 64-bit builds as in 32-bit builds (this issue) - Remove option for 32-bit builds from the build system (issue754) -> Subissues: - Use operator IDs instead of generating operator pointers (issue692) - Implement new hash function (issue693) - Implement new hash table (issue694) (blocked by issue693) - Store landmarks more efficiently (issue695) - Test that most important configurations perform as well in 64-bit builds as in 32-bit builds (this issue) - Remove option for 32-bit builds from the build system (issue754)
2018-09-11 11:29:28 malte set messages: + msg7363
2018-09-10 12:12:09 jendrik set messages: + msg7356
summary: Subissues: - Use operator IDs instead of generating operator pointers (issue692) - Implement new hash function (issue693) - Implement new hash table (issue694) (blocked by issue693) - Store landmarks more efficiently (issue695) After this is done, we should consider changing the build system (e.g., make 64-bit the default build, or remove the option altogether). This is issue754. -> Subissues: - Use operator IDs instead of generating operator pointers (issue692) - Implement new hash function (issue693) - Implement new hash table (issue694) (blocked by issue693) - Store landmarks more efficiently (issue695) - Test that most important configurations perform as well in 64-bit builds as in 32-bit builds (this issue) - Remove option for 32-bit builds from the build system (issue754)
2017-12-01 10:29:11 jendrik set summary: Subissues: - Use operator IDs instead of generating operator pointers (issue692) - Implement new hash function (issue693) - Implement new hash table (issue694) (blocked by issue693) - Store landmarks more efficiently (issue695) After this is done, we should consider changing the build system (e.g., make 64-bit the default build, or remove the option altogether). This is related to issue687. -> Subissues: - Use operator IDs instead of generating operator pointers (issue692) - Implement new hash function (issue693) - Implement new hash table (issue694) (blocked by issue693) - Store landmarks more efficiently (issue695) After this is done, we should consider changing the build system (e.g., make 64-bit the default build, or remove the option altogether). This is issue754.
2017-04-27 23:33:47 jendrik set priority: feature -> meta
messages: + msg6258
summary: After this is done, we should consider changing the build system (e.g., make 64-bit the default build, or remove the option altogether). This is related to issue687. -> Subissues: - Use operator IDs instead of generating operator pointers (issue692) - Implement new hash function (issue693) - Implement new hash table (issue694) (blocked by issue693) - Store landmarks more efficiently (issue695) After this is done, we should consider changing the build system (e.g., make 64-bit the default build, or remove the option altogether). This is related to issue687.
2017-04-27 19:13:15 malte set messages: + msg6253
2016-12-19 14:08:13 florian set messages: + msg5905
summary: After this is done, we should consider changing the build system (e.g., make 64-bit the default build, or remove the option altogether). This is related to issue687.
2016-12-14 21:25:33 jendrik set messages: + msg5890
2016-12-14 18:28:50 malte set messages: + msg5887
2016-12-14 18:20:58 jendrik set status: reviewing -> chatting
messages: + msg5886
2016-12-14 18:07:36 malte set messages: + msg5885
2016-12-14 15:47:41 jendrik set messages: + msg5882
2016-12-14 03:55:26 malte set messages: + msg5873
2016-12-14 01:33:41 jendrik set messages: + msg5872
2016-12-13 17:17:10 malte set messages: + msg5871
2016-12-13 16:53:43 jendrik set messages: + msg5870
2016-12-12 23:28:01 malte set messages: + msg5868
2016-12-12 23:27:37 malte set messages: + msg5867
2016-12-12 23:21:13 malte set files: + hash_sizes.py
messages: + msg5866
2016-12-12 16:29:02 jendrik set messages: + msg5865
2016-12-12 16:19:14 florian set messages: + msg5864
2016-12-12 14:36:05 jendrik set messages: + msg5863
2016-12-12 13:14:08 malte set messages: + msg5862
2016-12-12 11:56:48 jendrik set messages: + msg5861
2016-12-11 23:42:54 malte set messages: + msg5860
2016-12-11 23:15:29 jendrik set messages: + msg5859
2016-12-10 12:03:43 malte set messages: + msg5858
2016-12-09 11:17:37 malte set messages: + msg5856
2016-12-09 09:51:58 jendrik set messages: + msg5855
2016-12-08 17:56:16 jendrik set messages: + msg5854
2016-12-08 17:14:10 malte set messages: + msg5853
2016-12-08 17:10:25 malte set messages: + msg5852
summary: Jendrik - re the changes for v1 - is there a need to store the parent operator at all? A while back I stripped this out of a fork of FD, and when a goal state was reached, derived what the operators must have been by looking at the differences between states on the path to the goal. This has a small postprocessing cost, but saves memory. -> (no value)
2016-12-08 17:06:09 andrew.coles set summary: Jendrik - re the changes for v1 - is there a need to store the parent operator at all? A while back I stripped this out of a fork of FD, and when a goal state was reached, derived what the operators must have been by looking at the differences between states on the path to the goal. This has a small postprocessing cost, but saves memory.
2016-12-08 15:02:34 malte set messages: + msg5851
2016-12-08 14:53:00 jendrik set messages: + msg5850
2016-12-05 11:13:42 jendrik set messages: + msg5849
2016-12-05 10:51:20 florian set messages: + msg5847
2016-12-05 01:01:48 jendrik set status: chatting -> reviewing
messages: + msg5846
2016-11-30 19:16:01 malte set messages: + msg5844
2016-11-30 19:10:59 jendrik set assignedto: jendrik
messages: + msg5843
2016-11-30 00:28:34 jendrik set messages: + msg5842
2016-11-29 21:34:00 florian set messages: + msg5841
2016-11-29 21:18:41 jendrik set messages: + msg5840
2016-11-29 14:45:21 florian set messages: + msg5838
2016-11-29 14:38:37 jendrik set messages: + msg5837
2016-11-29 14:14:50 malte set messages: + msg5835
2016-11-29 14:03:22 florian set messages: + msg5834
2016-11-29 13:57:52 malte set messages: + msg5831
2016-11-29 11:54:22 jendrik set messages: + msg5828
2015-03-31 13:38:22 jendrik set messages: + msg4128
2014-10-27 14:01:11 andrew.coles set files: + state_registry_low_memory.tar.gz
messages: + msg3882
2014-10-27 11:20:12 florian set messages: + msg3881
2014-10-27 11:18:00 malte set messages: + msg3880
2014-10-27 11:17:39 florian set messages: + msg3879
2014-10-27 00:45:25 andrew.coles set files: + chunk_allocator.h
nosy: + andrew.coles
messages: + msg3878
2014-10-08 21:30:42 jendrik set nosy: + jendrik
messages: + msg3730
2014-10-08 20:37:02 florian set messages: + msg3727
2014-10-08 12:10:23 malte set messages: + msg3714
2014-10-08 09:04:47 florian set messages: + msg3702
2014-10-06 12:20:19 malte set messages: + msg3669
2014-10-06 12:16:55 florian set messages: + msg3668
2014-10-04 19:42:03 malte set nosy: + florian
messages: + msg3636
2011-08-13 20:17:45 malte set messages: + msg1608
2011-08-13 19:12:29 malte set messages: + msg1607
2011-01-23 11:30:08 malte set title: Inefficient Memory Use of Landmarks -> inefficient memory use with -m64 setting
2011-01-23 11:29:53 malte set messages: + msg1220
2011-01-23 11:11:44 erez create