msg8494 (view) |
Author: malte |
Date: 2019-01-21.13:58:32 |
|
Many thanks! :-)
|
msg8493 (view) |
Author: jendrik |
Date: 2019-01-21.13:56:24 |
|
I sent the email and moved the TODOs to issue690 and issue890.
|
msg8488 (view) |
Author: jendrik |
Date: 2019-01-21.12:40:04 |
|
I'll take care of this.
|
msg8486 (view) |
Author: malte |
Date: 2019-01-21.12:33:26 |
|
Outstanding, many thanks to everyone involved! :-)
I have reopened this for now so that the "TODOs for later" in the summary don't
get lost, as we don't usually look in resolved issues for open TODOs. Can
someone convert the remaining ones into issues and add pointers here, like for
the first entry that refers to issue838? It could be a catch-all issue or
individual ones for each item.
I think this would be worth a message to the mailing list. The main points I
would emphasize are:
- We no longer cross-compile to 32 bits by default and instead use the OS's
default options (native bitwidth).
- We previously forced 32-bit builds because they were a lot more
memory-efficient. We have worked to reduce the memory usage of 64-bit builds so
that there is no longer a reason to prefer 32-bit builds.
- The main advantages of this change are that it's now possible to use more than
4 GiB of memory without making performance compromises, that builds are easier
because they don't require unusual libraries, and that it's easier to use CPLEX,
where recent versions are only available in 64-bit versions.
|
msg8473 (view) |
Author: jendrik |
Date: 2019-01-18.17:10:31 |
|
We just merged issue754 and updated the wiki documentation. This means all TODOs
for this issue are done now and we can close this issue :-)
|
msg8364 (view) |
Author: malte |
Date: 2018-12-15.17:33:39 |
|
> I don't see any blockers. Shall we go ahead and remove the option for 32-bit
> builds (issue754), Malte?
I think the results look good enough, yes. I suppose the merging of issue754
should be coordinated with the necessary buildbot and documentation changes, but
that discussion should be part of issue754.
|
msg8310 (view) |
Author: jendrik |
Date: 2018-12-13.23:04:46 |
|
Here are the fixed results for the 5-minute satisficing experiment:
https://ai.dmi.unibas.ch/_tmp_files/seipp/issue213-v8-sat-5min-release32-vs-release64.html
I don't see any blockers. Shall we go ahead and remove the option for 32-bit
builds (issue754), Malte?
|
msg8307 (view) |
Author: jendrik |
Date: 2018-12-13.20:56:17 |
|
Here are the results for the debug build:
https://ai.dmi.unibas.ch/_tmp_files/seipp/issue213-v8-opt-5min-debug.html
https://ai.dmi.unibas.ch/_tmp_files/seipp/issue213-v8-sat-5min-debug.html
The failed assertion is issue467.
|
msg8306 (view) |
Author: malte |
Date: 2018-12-13.20:27:32 |
|
Those LP numbers would certainly be fine in my opinion.
(One effect of the failing LP runs is that it's hard to check the output for
unexplained errors because there are so many, so I didn't really do that check.)
|
msg8304 (view) |
Author: jendrik |
Date: 2018-12-13.20:24:37 |
|
Regarding results for LP configurations, we still have the data from v7
(https://ai.dmi.unibas.ch/_tmp_files/seipp/issue213-release32-vs-release64-v7-opt-30min.html)
Coverage increases for the diverse potential heuristics from 862 to 868 and
decreases from 899 to 892 for operator counting with state equation and LM-cut
constraints (this config is misleadingly called "seq" in the report).
Regarding the 5-minute satisficing experiment: the grid outage seems to have
affected this experiment. I have restarted it.
|
msg8302 (view) |
Author: malte |
Date: 2018-12-13.19:40:02 |
|
I added TODOs for later for CG and LM-Cut in the summary.
|
msg8301 (view) |
Author: malte |
Date: 2018-12-13.19:38:33 |
|
Focusing on the 30-minute results (unsurprisingly, the 5-minute results are
generally better):
- Optimal configurations look OK, except that LM-Cut isn't great. That's not
surprising given that we worked hard on performance enhancements for all delete
relaxation heuristics except for LM-Cut, which we didn't touch.
- Satisficing configurations look OK, except perhaps for h^cea. The unexplained
errors in the 5-minute experiments suggest that many runs were not started,
though. What's up there?
- The LAMA results look acceptable.
Regarding the LP results, it would be interesting to know if the results are
roughly in the same ballpark as old results or if they are massively worse. If
we lose 10-20 tasks, I wouldn't be too bothered. If we lose 100, I would.
This doesn't necessarily require new experiments if we have old data somewhere
that we can compare to.
|
msg8285 (view) |
Author: florian |
Date: 2018-12-13.13:19:17 |
|
> Florian and I briefly discussed the investigation into the operator-counting
> heuristics (msg7584). We believe the analysis is difficult to set up...
I don't think it is. I have old CPLEX versions and old OSI versions are
available online. The setup should be no more complicated than any other LP
tests. But I understand if you don't want to do it and I don't think we would be
able to finish this issue during this sprint if we considered this a blocker.
|
msg8284 (view) |
Author: jendrik |
Date: 2018-12-13.12:57:39 |
|
We repeated all experiments for v7 with an updated revision (v8). This revision
is the default branch after merging issue814. Here are the results:
https://ai.dmi.unibas.ch/_tmp_files/seipp/issue213-v8-opt-5min-release32-vs-release64.html
https://ai.dmi.unibas.ch/_tmp_files/seipp/issue213-v8-opt-30min-release32-vs-release64.html
https://ai.dmi.unibas.ch/_tmp_files/seipp/issue213-v8-sat-5min-release32-vs-release64.html
https://ai.dmi.unibas.ch/_tmp_files/seipp/issue213-v8-sat-30min-release32-vs-release64.html
https://ai.dmi.unibas.ch/_tmp_files/seipp/issue213-v8-lama-5min-release32-vs-release64.html
https://ai.dmi.unibas.ch/_tmp_files/seipp/issue213-v8-lama-30min-release32-vs-release64.html
I will start the 5min debug experiments now.
|
msg8283 (view) |
Author: jendrik |
Date: 2018-12-13.12:38:22 |
|
Florian and I briefly discussed the investigation into the operator-counting
heuristics (msg7584). We believe the analysis is difficult to set up since we
need an old CPLEX version that supports 32-bit and 64-bit builds (the previous
results used different CPLEX versions for the two builds). If we see a
performance degradation for the old CPLEX version, it is unclear whether we
would see it for a newer CPLEX version as well and how we could fix it. We
therefore don't consider this a blocker for removing the 32-bit build option and
I updated the summary accordingly.
|
msg8166 (view) |
Author: jendrik |
Date: 2018-12-06.15:38:50 |
|
Updated the summary: we decided that we can tackle two TODOs after this issue.
|
msg8165 (view) |
Author: jendrik |
Date: 2018-12-06.15:28:55 |
|
We looked at the results again and found that cea() also solves fewer tasks (-10)
using a 64-bit build. Added this to the summary.
|
msg7650 (view) |
Author: malte |
Date: 2018-09-20.15:57:22 |
|
> The hff portion of lama-first does ignore action costs:
[...]
> I looked into the results for a maintenance-sat14-adl task
Great, thanks for checking!
|
msg7647 (view) |
Author: jendrik |
Date: 2018-09-20.15:21:02 |
|
Add issue838 to summary.
|
msg7645 (view) |
Author: jendrik |
Date: 2018-09-20.14:51:42 |
|
Removing the synergy is issue839. I opened a new issue since we already have a
branch for issue123 in the repo and wanted to avoid any possible confusion this
could cause.
|
msg7642 (view) |
Author: jendrik |
Date: 2018-09-20.14:36:47 |
|
The hff portion of lama-first does ignore action costs:
$ ./fast-downward.py --alias lama-first ../benchmarks/parcprinter-opt11-strips/p01.pddl
[...]
Initial heuristic value for lama_synergy(lm_rhw(reasonable_orders = true, lm_cost_type = one), transform = adapt_costs(one)): 25
Initial heuristic value for ff_synergy(hlm): 21
[...]
I looked into the results for a maintenance-sat14-adl task (maintenance-1-3-060-180-5-001.pddl): both the FF and the landmark
heuristic have the same initial h values in both lama-first versions, but afterwards the h values differ and lama-first-syn runs
out of memory. This doesn't look like a bug to me. I think it could be due to tie-breaking and/or differences in the
implementation.
|
msg7635 (view) |
Author: jendrik |
Date: 2018-09-20.13:18:15 |
|
I made some local measurements to analyze the increase in memory usage of the configuration
using the causal graph heuristic:
peak memory for airport:p19, dynamically linked build, revision issue213-v7 (e1efb55725b3):
--heuristic "h=goalcount()" --search "lazy_greedy([h], preferred=[h], bound=0)":
32-bit: 7724 KB
64-bit: 21220 KB
--heuristic "h=goalcount()" --search "lazy_greedy([h], preferred=[h])":
32-bit: 13316 KB
64-bit: 26740 KB
--heuristic "h=cg()" --search "lazy_greedy([h], preferred=[h], bound=0)":
32-bit: 51156 KB
64-bit: 85228 KB
--heuristic "h=cg()" --search "lazy_greedy([h], preferred=[h])":
32-bit: 58552 KB
64-bit: 92540 KB
The data suggests that the memory overhead of the cg configuration stems from the
initialization of the heuristic, not the search. In 32-bit mode the heuristic needs about
51156-7724=43432 KB. In 64-bit mode it needs about 85228-21220=64008 KB. This is
corroborated by the massif profiles for the cg heuristic (attached): the 32-bit version
needs about 20 MB less memory than the 64-bit version at the peak. The culprit for the extra
memory usage is exclusively the helpful_transition_cache
(std::vector<std::vector<domain_transition_graph::ValueTransitionLabel *>>
helpful_transition_cache;) in file cg_cache.h.
I opened issue838 for reducing the memory usage of the causal graph cache and suggest to
continue our discussion about this there.
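As a rough illustration of where this overhead comes from, here is a back-of-the-envelope model (not a measurement of the actual planner: the container-header and pointer sizes are typical for libstdc++, and the state and entry counts are made up):

```python
# Estimate the size of a cache laid out as
# std::vector<std::vector<T *>>, as helpful_transition_cache is.
# A std::vector header is three pointers (begin, end, capacity end),
# so both the headers and the cached pointers double in size when
# moving from 4-byte to 8-byte pointers.

def cache_bytes(num_states, avg_helpful_transitions, pointer_size):
    vector_header = 3 * pointer_size  # std::vector bookkeeping
    inner = num_states * (vector_header +
                          avg_helpful_transitions * pointer_size)
    return vector_header + inner

states = 1_000_000       # hypothetical number of cached states
avg_entries = 4          # hypothetical helpful transitions per state
print(cache_bytes(states, avg_entries, 4))  # 32-bit build
print(cache_bytes(states, avg_entries, 8))  # 64-bit build
```

In this pointer-only model the 64-bit cache is exactly twice as large; the measured growth above (43432 KB vs. 64008 KB) is smaller because the heuristic also allocates data whose size does not depend on the pointer width.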
|
msg7632 (view) |
Author: malte |
Date: 2018-09-20.12:42:15 |
|
Thanks, Jendrik! Very interesting!
I suspected the synergy might not be useful anymore these days since the
"regular" heuristics have seen a lot of TLC that the synergy heuristics haven't.
That was the main reason why I suggested the experiment.
I think we should revisit getting rid of the synergy.
Looking at the option string for lama-first, does the hff portion actually
properly ignore the action costs? I assume perhaps yes, via the master, but can
you verify?
The coverage difference in maintenance-sat14-adl is striking. Can you have a
quick look at the run files for the version with synergy to check if we might be
overlooking a bug there?
There is an old issue about dropping the synergy, which we can either reopen, or
create a new one pointing to the old one for additional context. Then I suggest
we run (30-minute) experiments that compare lama-first to lama-nosyn-first and
lama to lama-nosyn (i.e., the anytime configuration). If the numbers are similar
to what we see here, there is a lot that can be said in favour of removing the
synergy.
|
msg7630 (view) |
Author: jendrik |
Date: 2018-09-20.12:08:45 |
|
Add "Investigate drop in performance of the operator-counting heuristics
(msg7584)" to summary.
|
msg7629 (view) |
Author: jendrik |
Date: 2018-09-20.12:06:36 |
|
Here are the results for lama-first and lama-first without the synergy:
https://ai.dmi.unibas.ch/_tmp_files/seipp/issue213-release32-vs-release64-v7-lama-first-30min.html
In both cases, the 32-bit build solves more tasks overall than the 64-bit build. Interestingly, the
versions without synergy solve 24 (32-bit) and 19 (64-bit) more tasks. They also have higher time
scores, but lower memory scores.
|
msg7584 (view) |
Author: florian |
Date: 2018-09-19.16:17:16 |
|
I would be interested to see if the drop in performance of the operator-counting
heuristics comes from the change in CPLEX version or from our side. In the report
it was the configuration with the second-highest coverage and it had the
second-highest drop in coverage, so I think it is worth investigating.
|
msg7561 (view) |
Author: jendrik |
Date: 2018-09-19.12:28:35 |
|
We looked at the results and found that it would be good to create time and
memory profiles for the configurations using cea() and cg().
Also, we want to run an experiment for LAMA-first without the synergy.
|
msg7514 (view) |
Author: malte |
Date: 2018-09-18.14:19:20 |
|
Thanks, Jendrik! For those of us keeping score, the total differences going from
32 bits to 64 bits are currently (v7):
- opt-30min: coverage +17
- worst config: BJOLP -15; this has not integrated the latest changes, I think
- sat-30min: -coverage 38
- worst config: ff-typed -11
- anytime-30min: coverage -6, quality -7.25
- opt-5min: coverage +40
- worst config: hmax -5
- sat-5min: coverage -40
- worst config: lazy hadd -16
- anytime-5min: coverage -2, quality -2.48
Looking at coverage is good, but I think before we merge we should also look at
score_memory and score_total_time. If we want to parallelize the issue, it would
be useful if someone could check whether we have a significant drop in performance
that cannot be explained by h^max, h^add or h^FF (which are covered in issue814).
|
msg7512 (view) |
Author: jendrik |
Date: 2018-09-18.13:59:26 |
|
The remaining experiments for v7 finished. Here are all results requested in msg7470:
http://ai.cs.unibas.ch/_tmp_files/seipp/issue213-release32-vs-release64-v7-opt-30min.html
http://ai.cs.unibas.ch/_tmp_files/seipp/issue213-release32-vs-release64-v7-sat-30min.html
http://ai.cs.unibas.ch/_tmp_files/seipp/issue213-release32-vs-release64-v7-lama-30min.html
http://ai.cs.unibas.ch/_tmp_files/seipp/issue213-release32-vs-release64-v7-opt-5min.html
http://ai.cs.unibas.ch/_tmp_files/seipp/issue213-release32-vs-release64-v7-sat-5min.html
http://ai.cs.unibas.ch/_tmp_files/seipp/issue213-release32-vs-release64-v7-lama-5min.html
|
msg7491 (view) |
Author: jendrik |
Date: 2018-09-17.15:40:47 |
|
I have started the remaining experiments.
|
msg7470 (view) |
Author: malte |
Date: 2018-09-15.15:32:31 |
|
Many thanks! This confirms that there is still work to do with the satisficing
configurations.
I suggest we also run an experiment with Lama (full), which I guess would need
to be separate from the non-anytime planners. So our full set of experiments for
the meta issue would have six parts:
- optimal, satisficing, anytime
- each run with a 5-minute or 30-minute timeout
If we have these six packaged together in the repository somewhere, we could
also use them in the future for similar whole-planner performance experiments.
I'm not suggesting anything fancy, just six experiment files within this issue
with a clear naming convention that contain all the experiments in one place,
i.e., include the "extra configs" etc. Something like
issue213-v7-release32-vs-release64-{opt,sat,any}-{5min,30min}.py
or whatever makes sense. That would help someone like me doing the same kinds of
experiments in the future.
Based on the sprint experience, I suggest we also run copies of the 5-minute
versions in debug mode at least once. I don't think there's a need for 30-minute
debug experiments.
|
msg7469 (view) |
Author: jendrik |
Date: 2018-09-15.09:42:08 |
|
Here are the results for the satisficing configurations using a time limit of 30 minutes:
http://ai.cs.unibas.ch/_tmp_files/seipp/issue213-release32-vs-release64-v7-sat-30min.html
|
msg7455 (view) |
Author: malte |
Date: 2018-09-14.11:17:39 |
|
Thanks! I would like to have all data in one place for the same revision at the
point where we actually make the transition from 32 bits to 64 bits, but we
don't need to do that right now. Because of the necessary performance
enhancements for some of the satisficing configurations, we will not close this
meta-issue during the current sprint, so there is no particular hurry.
|
msg7454 (view) |
Author: jendrik |
Date: 2018-09-14.11:01:48 |
|
The experiment does not contain the changes from issue695. If we don't run the experiment again, we can use the results from issue695 to judge whether BJOLP loses
coverage when switching to 64-bit (http://ai.cs.unibas.ch/_tmp_files/simon/issue695-v2-opt-issue695-base-issue695-v2-compare.html). These results show that
overall coverage remains almost the same (917 vs. 916 tasks).
|
msg7453 (view) |
Author: malte |
Date: 2018-09-14.10:26:50 |
|
Many thanks, Jendrik!
Some tasks in the experiment reach the hard limit for the log and are hence
aborted. We should look into this.
The BJOLP results don't look so good. Was the experiment started before or after
issue695 was merged?
Otherwise things look quite good! :-) I was surprised to see the strong coverage
of seq, until I realized that it isn't actually seq, but OCC with seq + LM-Cut
constraints. In future experiments, I'd prefer to call this "occ".
|
msg7449 (view) |
Author: jendrik |
Date: 2018-09-14.09:17:58 |
|
Here are the results for the optimal configurations with a 30 minute time limit:
http://ai.cs.unibas.ch/_tmp_files/seipp/issue213-release32-vs-release64-issue213-v7-opt-30min.html
|
msg7432 (view) |
Author: jendrik |
Date: 2018-09-13.14:14:56 |
|
Here are the results for the additional optimal configurations:
http://ai.cs.unibas.ch/_tmp_files/seipp/issue213-release32-vs-release64-issue213-v7-opt-extra-configs.html
|
msg7430 (view) |
Author: malte |
Date: 2018-09-13.12:30:49 |
|
For the experiments, we discussed offline whether it's OK to use 5 minute
timeouts throughout (as we did in most experiments here) because the experiments
are large. I think we will eventually also need to do experiments with 30-minute
timeouts.
The short experiments help us catch cases where the 64-bit build becomes a lot slower,
but they help us less when the issue is running out of memory. One can try to
simulate the memory pressure by also setting a lower memory limit, but it's hard
to calibrate, and I think it would be useful to have the data for our "real"
standard time and memory settings.
This shouldn't hold up everything else: I think it's fine to first find out what
needs to improve based on 5-minute runs and do 30-minutes at leisure when the
grid is less busy.
|
msg7427 (view) |
Author: jendrik |
Date: 2018-09-13.12:14:06 |
|
The experiment using the type-based open list (with lama-first) is done:
http://ai.cs.unibas.ch/_tmp_files/seipp/issue213-release32-vs-release64-issue213-v7-sat-extra-configs.html
I think it's best to wait for issue814 before we try to improve the type-based config.
|
msg7425 (view) |
Author: malte |
Date: 2018-09-13.11:58:32 |
|
I changed the title of this meta issue to reflect what it is now actually about.
Also, after looking at the pull request for issue754, I realize that we have in
some sense been wrong in saying that this issue is about "switching to 64-bit".
Rather, what we are doing is switching to *native bit-size* builds only. Whether
these are 32-bit builds or 64-bit builds depends on the operating system and
compiler environment.
So we should be aware in the future that it is still possible that the code is
compiled on a 32-bit platform. For example, a long may be 32 bits only, and if
we need a 64-bit data type, we need to use something that is explicitly 64 bits.
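This is easy to check from Python (illustrative only): ctypes reports the sizes the platform's C ABI uses, and `long` is the type whose width varies.

```python
# `long` is 8 bytes on 64-bit Linux/macOS, but 4 bytes on 32-bit
# platforms (and even on 64-bit Windows). Fixed-width types such as
# int64_t are always exactly what their name says, which is why code
# that needs 64 bits should name them explicitly.
import ctypes

print(ctypes.sizeof(ctypes.c_long))   # 4 or 8, depending on platform
print(ctypes.sizeof(ctypes.c_int64))  # 8 everywhere
```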
|
msg7422 (view) |
Author: jendrik |
Date: 2018-09-13.11:45:15 |
|
Experiments for additional configs are running.
|
msg7399 (view) |
Author: malte |
Date: 2018-09-12.14:36:34 |
|
Many thanks, Jendrik! Indeed it looks like time is the issue, not memory.
So we'll have issue814 look at time and memory instead of only memory. We have
already found the wiki documentation on time profiling.
In some cases, the behaviour seems to vary quite a bit depending on which
heuristic is used. Do we have sufficient coverage of heuristics in these
experiments?
More generally, do we include all major planner components we care about? How
difficult would it be, for example, to run all the configurations used in the
IPC (including portfolios), or rather the closest approximations of these
configurations that we have in the master repository? I mean both the
satisficing and the optimal case.
It's not necessary to run the exact configurations, but if there are some larger
parts of the planner, like a specific heuristic or open list or search algorithm
used in the IPC that we don't currently cover, then it would be good to change
that. For example, I think stubborn sets can be quite time-critical, but we
don't currently cover them, right?
|
msg7397 (view) |
Author: jendrik |
Date: 2018-09-12.14:06:38 |
|
Interestingly, at least for lazy greedy add(), the reason for solving fewer tasks
is not the extra memory, but running out of time. But I guess we can hope that
using less memory makes the code more cache efficient again.
|
msg7394 (view) |
Author: jendrik |
Date: 2018-09-12.13:47:00 |
|
I added memory profiling instructions at
http://www.fast-downward.org/ForDevelopers/Profiling.
|
msg7387 (view) |
Author: malte |
Date: 2018-09-12.11:19:36 |
|
Making lazy greedy search more memory-efficient in 64-bit mode is issue814.
|
msg7385 (view) |
Author: malte |
Date: 2018-09-12.11:14:28 |
|
Thanks, Jendrik! The question is at what point we are happy to move to 64-bit.
Some possible answers:
1. When 64-bit is as good or better than 32-bit across the board (or at least
close).
2. When 64-bit is as good or better than 32-bit on average.
If we just add up the coverage scores, we currently have +16 for 64-bit in the
optimal experiment and -30 for 64-bit in the satisficing experiment, so we are
not too far away. The numbers should get better when issue695 is integrated.
Of course that doesn't mean that we should ignore the existing memory
inefficiencies. I guess the next step there should be to do a memory profile for
one of the bad configurations in 32-bit and 64-bit mode? The worst configuration
seems to be lazy_greedy_add = [u'--heuristic', u'h=add()', u'--search',
u'lazy_greedy([h], preferred=[h])']
Jendrik, can you give some for-dummies instructions of how to do the memory
profiles that we used before to diagnose this?
|
msg7383 (view) |
Author: jendrik |
Date: 2018-09-12.09:30:12 |
|
I ran two experiments:
http://ai.cs.unibas.ch/_tmp_files/seipp/issue213-release32-vs-release64-issue213-v7-opt.html
http://ai.cs.unibas.ch/_tmp_files/seipp/issue213-release32-vs-release64-issue213-v7-sat.html
It looks like there's still work to do, especially for the satisficing configurations.
|
msg7365 (view) |
Author: jendrik |
Date: 2018-09-11.12:13:52 |
|
I will most certainly not!
|
msg7363 (view) |
Author: malte |
Date: 2018-09-11.11:29:28 |
|
Please ignore this (email test).
|
msg7356 (view) |
Author: jendrik |
Date: 2018-09-10.12:12:09 |
|
Update summary.
|
msg6258 (view) |
Author: jendrik |
Date: 2017-04-27.23:33:47 |
|
Converted to meta-issue and updated the summary.
|
msg6253 (view) |
Author: malte |
Date: 2017-04-27.19:13:15 |
|
This issue is next in the review queue, but it really looks like a meta-issue
rather than a regular one. Can you
1) convert it into a meta-issue,
2) update the summary to include the current status and open TODOs, and
3) let me know which of the subissues is the one that should be reviewed next and
update the entry in the review queue accordingly?
|
msg5905 (view) |
Author: florian |
Date: 2016-12-19.14:08:13 |
|
Added a reference to issue687.
|
msg5890 (view) |
Author: jendrik |
Date: 2016-12-14.21:25:33 |
|
Of course we can test the new hash function on multiple optimal configurations. I
don't expect any surprises if we're happy with the performance of blind search,
but you never know :-)
|
msg5887 (view) |
Author: malte |
Date: 2016-12-14.18:28:50 |
|
> Regarding the first item: bjolp uses uniform cost partitioning, so CPLEX is not
> involved. Probably you meant seq instead of bjolp. I suggest to rerun the
> experiment for seq once we've merged the new hash table implementation.
Yes, I meant seq. Regarding rerunning the experiments, does this mean that you
advocate against testing the new hash table implementation on our standard
optimal planning configurations? Since it's a rather performance-critical
change, I'd be happier to have it evaluated on more than just blind search
before merging. (But perhaps this discussion should be in issue694.)
> Regarding the second item: I think it would be good to defer experiments until
> we're happy with optimal configurations.
Fine with me. I don't think we need to test the satisficing configurations for
issue692, but I think we should for issue693 (some domains have massively larger
states, so hash function performance could be interesting) and for issue694 (for
similar reasons). But of course we can focus on optimal configurations for these
issues first.
> We could convert the lazy open list
> operator pointers in issue688. Or do you prefer a separate issue for that?
Separate issues are usually easier to review and hence faster to be merged. This
is especially true when we expect a performance impact from some of the changes
and want to know where it comes from. So I think in this case a separate issue
would be better.
|
msg5886 (view) |
Author: jendrik |
Date: 2016-12-14.18:20:58 |
|
Regarding the first item: bjolp uses uniform cost partitioning, so CPLEX is not
involved. Probably you meant seq instead of bjolp. I suggest to rerun the
experiment for seq once we've merged the new hash table implementation.
Regarding the second item: I think it would be good to defer experiments until
we're happy with optimal configurations. We could convert the lazy open list
operator pointers in issue688. Or do you prefer a separate issue for that?
|
msg5885 (view) |
Author: malte |
Date: 2016-12-14.18:07:36 |
|
Great! For this issue, I can think of two more TODO items:
- Rerun the BJOLP comparison with identical OSI/CPLEX versions. (Ideally we
should use the most recent versions for both, but IIRC, this doesn't work
because the newest CPLEX version has 32-bit issues. But perhaps we can still use
our recommended OSI version for both and couple it with an older CPLEX version
that still supports both architectures. Florian can hopefully advise on this.)
- Run experiments for satisficing configurations. If we do this later when we're
happy with the changes affecting optimal planners, we'll have to run fewer
experiments overall. If we do it earlier, we might be quicker to notice other
parts of the code we need to change before moving to 64-bit compiles. Both
options are fine for me.
Right now I can only think of one obvious change we will want to make for the
satisficing search configurations: replace the generating operator pointer in
lazy open lists with an operator ID, similar to the change for SearchNodeInfo.
We might then also want to change the interface so that it works with these IDs
and we don't have to convert back and forth between operator pointers and IDs
when we actually use IDs on both ends.
|
msg5882 (view) |
Author: jendrik |
Date: 2016-12-14.15:47:41 |
|
Yes, the experiment only compares the old hash function and hash table to the new
hash function and hash table. On the instance I tested locally, the old hash
function basically produced ascending hash values, which is very bad for
hopscotch hashing.
I have split off the issues as you suggested:
A) issue692: operator IDs instead of generating operator pointers
B) issue693: new hash function
C) issue694: new hash table implementation
D) issue695: space improvement for LM-Count
|
msg5873 (view) |
Author: malte |
Date: 2016-12-14.03:55:26 |
|
Those numbers certainly look better. :-) There still seems to be a bit of a
slowdown in most tasks, but I think that would be an acceptable penalty given
the memory improvements.
I still think we should split this into separate issues, though. For example, if
we change the hash function and hash table at the same time (which I think the
experiment does?), the changes become harder to evaluate.
> In light of these results, I suggest we polish and merge this hash table
> implementation, before we try to come up with fancier versions :-) What do
> you think?
Sounds fine to me. (Regarding alternative hash table implementations, I was
really hoping that we could use something simpler rather than something more
complex. But either way, this can wait.)
|
msg5872 (view) |
Author: jendrik |
Date: 2016-12-14.01:33:41 |
|
I analyzed and tweaked the code a bit and was able to speed it up while preserving the same memory
usage. The most important change was to not sort the old buckets during a resize. Previously, I
was trying to be smart about this to achieve a better layout, but the extra effort didn't pay off.
In fact, experiments showed that the order in which buckets are reinserted doesn't affect memory
usage or runtime.
The resulting code version is tagged revision v5. Here are the results comparing v5 to v1, the
version using std::unordered_map:
http://ai.cs.unibas.ch/_tmp_files/seipp/issue213-issue213-v1-vs-issue213-v5-release32.html
http://ai.cs.unibas.ch/_tmp_files/seipp/issue213-issue213-v1-vs-issue213-v5-release64.html
-> Memory usage and runtime decrease for both 32- and 64-bit mode. The effect is more pronounced
for memory. Coverage goes up by 7 and 18 tasks.
http://ai.cs.unibas.ch/_tmp_files/seipp/issue213-release32-vs-release64-issue213-v1.html
-> With std::unordered_set, m64 solves 11 fewer tasks than m32. As we know, this is due to m64
using much more memory than m32 in v1.
http://ai.cs.unibas.ch/_tmp_files/seipp/issue213-release32-vs-release64-issue213-v5.html
-> m64 still uses more memory than m32, but this can't be due to the hash set since it should use
the same amount of memory in both settings. The runtimes are about the same on average. Both build
types solve the same number of tasks.
In light of these results, I suggest we polish and merge this hash table implementation, before we
try to come up with fancier versions :-) What do you think?
|
msg5871 (view) |
Author: malte |
Date: 2016-12-13.17:17:10 |
|
Works for me, but the grid will be crowded then, and in an ideal world it would
be great to have some of the aspects of this already merged before then.
|
msg5870 (view) |
Author: jendrik |
Date: 2016-12-13.16:53:43 |
|
I think it's best to discuss in person during the sprint how we want to move
forward with this.
|
msg5868 (view) |
Author: malte |
Date: 2016-12-12.23:28:01 |
|
Yet another point: it looks like so far we have only experimented with optimal
planning configurations? Before merging, we also need data for satisficing planning.
|
msg5867 (view) |
Author: malte |
Date: 2016-12-12.23:27:37 |
|
Taking the wider view: I am somewhat reluctant to rush merging the new hash
implementation because of its time overhead. But I don't think this should keep
us from merging other, less critical aspects of this that affect other parts of
the code.
Suggestion: we split off the different code changes (e.g. replacing generating
operator pointers by operator IDs, new hash function, new hash table
implementation, space improvement for lmcount heuristic) into their own issues,
and we use this issue for keeping track of the relative performance of m32 vs.
m64 and the eventual switch (hopefully) to m64 in the build configs etc.
I see the following largely independent changes:
A) operator IDs instead of generating operator pointers
B) new hash function
C) new hash table implementation
D) space improvement for LM-Count
If we split these off, hopefully we can at least merge A, B and D soon.
(Of course, D still needs to be done.)
What do you think?
|
msg5866 (view) |
Author: malte |
Date: 2016-12-12.23:21:13 |
|
Thanks!
I don't think tweaking the parameter is the best solution at the moment. Unless
there is a performance issue in the implementation that we still have to
identify, perhaps this hash table implementation just is a bit slower than the
standard library one. It stands to reason that the standard library
implementation is optimized for something. Perhaps it is optimized for speed. ;-)
There are a few alternatives we can try. We wanted to avoid separate chaining
because of its space requirements, but we can also try bringing the space
requirements of separate chaining down for our particular use case by
compressing the information. Currently, the values -2 to -2^31 are unused, and
we could exploit this to use a single 4-byte cell to store either a state ID or
(the equivalent of) a pointer to a chain of collided values.
Assuming a load factor of 1.0, which is not too unreasonable when using separate
chaining, a "basic" but space-efficient implementation of separate chaining
takes 12 bytes per entry if there is no dynamic allocation overhead (which I
think is reasonable for a hash table that does not support deletion): 4 bytes
per bucket plus 8 bytes per linked list entry that holds the data.
I think that with some compression tricks, we can bring this down to less than 7
bytes per entry, i.e., the same amount of memory as with closed hashing at a
load factor of 0.57. (Closed hashing of course needs load factors to be lower
than what we can use with open hashing.)
This assumes that we don't need to store the hash keys -- if we do need to store
them, the comparison becomes more favourable for closed hashing because the
relative overhead of the "pointers" becomes smaller. I'm attaching a Python
script that allows estimating the size requirements of different hash table
implementations.
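The size comparison above can be reproduced with a short back-of-the-envelope calculation. This is an illustrative sketch (not the attached script); the function names and parameters are invented here, and the numbers assume 4-byte state IDs with no dynamic allocation overhead:

```python
def closed_hashing_bytes_per_entry(slot_bytes, load_factor):
    """Open addressing (closed hashing): the table is a flat array of
    slots, so the cost per stored entry is slot size / load factor."""
    return slot_bytes / load_factor


def separate_chaining_bytes_per_entry(bucket_bytes, node_bytes, load_factor):
    """Separate chaining: one bucket head per slot plus one list node
    per entry (assuming no per-allocation overhead)."""
    return bucket_bytes / load_factor + node_bytes


# Closed hashing with 4-byte slots at a load factor of 0.57:
print(closed_hashing_bytes_per_entry(4, 0.57))   # ~7.02 bytes/entry

# "Basic" separate chaining at a load factor of 1.0, with a 4-byte
# bucket head and an 8-byte node (4-byte value + 4-byte next link):
print(separate_chaining_bytes_per_entry(4, 8, 1.0))  # 12.0 bytes/entry
```

This matches the figures in the message: roughly 7 bytes per entry for closed hashing at load factor 0.57, and 12 bytes per entry for basic separate chaining at load factor 1.0.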
|
msg5865 (view) |
Author: jendrik |
Date: 2016-12-12.16:29:02 |
|
I profiled blind search on pegsol-08-strips:p24.pddl. In 32-bit mode revision v1
uses 7.5s and v3 uses 9.6s.
v1:
3.13% of the time is spent in StateRegistry::get_successor_state().
StateRegistry::insert_id_or_pop_state() seems to be inlined.
3.13% of the time is spent inserting states into the closed list.
v3:
18.18% of the time is spent in StateRegistry::get_successor_state().
15.16% of the time is spent in StateRegistry::insert_id_or_pop_state().
10.30% of the time is spent in IntHashSet::reserve().
3.68% of the time is spent sorting the temporary vector.
One parameter we could try changing is "max_distance". Currently each
key can be at most 32 buckets from its ideal bucket. Raising this
number would allow for higher load factors (and to a lesser extent
fewer resizes) at the cost of longer lookups.
For v3 in 64-bit mode changing this parameter for the pegsol task produced:
max_distance: 32 64 128
load factor: 0.43 0.86 0.86
peak memory: 100 MB 70 MB 70 MB
resizes: 22 21 21
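To illustrate what max_distance trades off, here is a toy open-addressing set in Python (all names invented for this sketch; the planner's IntHashSet is a C++ implementation with different details). A key may be placed at most max_distance slots past its ideal bucket; if no slot within that bound is free, the table resizes:

```python
class BoundedProbeSet:
    """Linear probing with a displacement bound: exceeding the bound
    forces a resize instead of probing further."""

    EMPTY = None

    def __init__(self, max_distance=32):
        self.max_distance = max_distance
        self.slots = [self.EMPTY] * 8
        self.size = 0

    def insert(self, key):
        ideal = hash(key) % len(self.slots)
        for distance in range(self.max_distance + 1):
            slot = (ideal + distance) % len(self.slots)
            if self.slots[slot] == key:
                return  # already present
            if self.slots[slot] is self.EMPTY:
                self.slots[slot] = key
                self.size += 1
                return
        self._resize()  # displacement bound exceeded
        self.insert(key)

    def _resize(self):
        old_keys = [key for key in self.slots if key is not self.EMPTY]
        self.slots = [self.EMPTY] * (2 * len(self.slots))
        self.size = 0
        for key in old_keys:
            self.insert(key)

    def load_factor(self):
        return self.size / len(self.slots)
```

A larger max_distance lets the table run fuller before resizing (higher load factor, fewer resizes) at the price of longer probe sequences on lookup, which is the trade-off visible in the numbers above.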
|
msg5864 (view) |
Author: florian |
Date: 2016-12-12.16:19:14 |
|
We have seen huge performance differences between OSI 0.103.0 and OSI 0.107.8.
If you want to compare 32 and 64 bit results, both revisions should use the same
version.
|
msg5863 (view) |
Author: jendrik |
Date: 2016-12-12.14:36:05 |
|
Regarding "v3: m32-vs-m64": Yes, exactly.
The experiments use the following library versions:
32-bit: OSI 0.103.0, CPLEX 12.5.1
64-bit: OSI 0.107.8, CPLEX 12.5.1
http://www.fast-downward.org/LPBuildInstructions mentions OSI 0.107.8 and CPLEX 12.6.3.
|
msg5862 (view) |
Author: malte |
Date: 2016-12-12.13:14:08 |
|
So some good news and some bad news, then!
I assume the second "v1: m32-vs-m64" line should be "v3: m32-vs-m64"?
I understand the desire to make the new m64 version look good, but a base-m32
vs. v3-m64 comparison is not meaningful for the decision we have to make in this
issue. We want to make the m64 compile competitive with the m32 compile. It
doesn't really matter if the m64 compile of revision X is competitive with the
m32 compile of revision Y.
Regarding the coverage drops for bjolp and seq:
The drop for seq may be CPLEX's fault, so not sure if we can do much about it.
Still, this probably warrants a closer look. I'd be a bit surprised to see the
32-bit version of CPLEX do massively better than the 64-bit version. Are these
otherwise identical versions of CPLEX and OSI? Are these the versions we recommend?
The drop for bjolp surprised me a bit, so I had a look at how it stores landmark
data, and it uses a PerStateInformation<vector<bool>>, which is very wasteful
because we pay the vector overhead for every single state. For our state
representation, we moved from SegmentedVector to SegmentedArrayVector for this
reason, and it made a big difference. We should probably do a similar thing
here, i.e., have something like PerStateInformation optimized for same-size
arrays. (And because this is a vector<bool>, we additionally need to pack
individual bits.)
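A same-size, bit-packed per-state store along the lines sketched above could look as follows. This is a hypothetical Python illustration (class and method names invented); the actual fix would be a C++ class next to PerStateInformation:

```python
class PerStateBitset:
    """Stores a fixed number of bits per state in one flat bytearray,
    addressed by state ID -- no per-state vector overhead."""

    def __init__(self, bits_per_state):
        self.bits_per_state = bits_per_state
        self.data = bytearray()

    def _ensure_capacity(self, state_id):
        needed_bits = (state_id + 1) * self.bits_per_state
        needed_bytes = (needed_bits + 7) // 8
        if len(self.data) < needed_bytes:
            self.data.extend(b"\x00" * (needed_bytes - len(self.data)))

    def set_bit(self, state_id, index, value):
        self._ensure_capacity(state_id)
        bit = state_id * self.bits_per_state + index
        if value:
            self.data[bit // 8] |= 1 << (bit % 8)
        else:
            self.data[bit // 8] &= ~(1 << (bit % 8)) & 0xFF

    def get_bit(self, state_id, index):
        bit = state_id * self.bits_per_state + index
        if bit // 8 >= len(self.data):
            return False  # never written -> default false
        return bool((self.data[bit // 8] >> (bit % 8)) & 1)
```

With L landmarks and N states this needs roughly N*L/8 bytes, versus one heap-allocated vector<bool> object per state for PerStateInformation<vector<bool>>.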
|
msg5861 (view) |
Author: jendrik |
Date: 2016-12-12.11:56:48 |
|
Sure, here we go:
m32:
base-vs-v1: pointers and ints have the same size, so memory and runtime
don't change much. Coverage drops by 1 in total.
v1-vs-v3: memory usage is significantly reduced. Runtime results are
mixed. There's a slight negative trend for bjolp, cegar and ipdb and a
stronger negative trend for blind. Overall, all configurations benefit
from the change: coverage stays the same or increases in all domains for
all configurations. Together, the configurations solve 61 more tasks.
m64:
base-vs-v1: both memory and total_time scores go up leading to 15
additional solved tasks.
v1-vs-v3: The picture is very similar to the 32-bit v1-vs-v3
comparison. Memory usage goes down, runtime goes up. The change in
memory is much more significant than the one for runtime though. In
total, coverage increases by 122 tasks.
base: m32-vs-m64 - Coverage drops for all configurations. Total coverage diff = -142
v1: m32-vs-m64 - Coverage drops for all configurations. Total coverage diff = -136
v1: m32-vs-m64 - Coverage drops by 20 and 32 for bjolp and seq, respectively. It
drops by at most five for divpot, ipdb, lmcut and mas. It remains
the same for blind and cegar. Total coverage diff = -65
base-m32 vs. v3-m64: Memory usage increases significantly for most
configurations. Results for runtime are mixed. Coverage decreases by 12
and 33 for bjolp and seq, respectively. It increases for all other
configurations (+40 in total). Total coverage diff = -5.
|
msg5860 (view) |
Author: malte |
Date: 2016-12-11.23:42:54 |
|
That's a lot of data to go through. Can you summarize the results?
|
msg5859 (view) |
Author: jendrik |
Date: 2016-12-11.23:15:29 |
|
Thanks! Since I didn't expect you to look into this so soon, I went ahead and ran
an experiment for more configurations:
http://ai.cs.unibas.ch/_tmp_files/seipp/issue213-v3-opt.tar.gz
The reports make the following comparisons:
- compare revisions base, v1 and v3 for a fixed build type.
- compare the two build types (m32 and m64) for a fixed revision.
- compare base-m32 to v3-m64
|
msg5858 (view) |
Author: malte |
Date: 2016-12-10.12:03:43 |
|
I added some comments on bitbucket.
|
msg5856 (view) |
Author: malte |
Date: 2016-12-09.11:17:37 |
|
Sounds good, but it may take a while. Lots on my plate at the moment.
|
msg5855 (view) |
Author: jendrik |
Date: 2016-12-09.09:51:58 |
|
I went over the IntHashMap code again and made some adjustments that speed
things up while using the same amount of memory. For example, each key is now
only hashed once. I'm sure there are further speed optimizations that I haven't
discovered.
http://ai.cs.unibas.ch/_tmp_files/seipp/issue213-v2-vs-v3-blind.html
Before I test other configurations, I think it would be good if you could have
a look at the code, Malte.
|
msg5854 (view) |
Author: jendrik |
Date: 2016-12-08.17:56:16 |
|
Regarding 1) in msg5851: I completely agree, this experiment was just meant to be a quick way of assessing where
we stand. I'll run more extensive experiments before merging.
Regarding 2) in msg5851: I have repeated the experiment for 32-bit builds:
http://ai.cs.unibas.ch/_tmp_files/seipp/issue213-v2-blind-m32-issue213-v1-issue213-v2-compare.html
Here is the complete coverage table for blind search:
version coverage
------------------
base-m32: 617
v1-m32: 617
v2-m32: 624
base-m64: 602
v1-m64: 606
v2-m64: 624
|
msg5853 (view) |
Author: malte |
Date: 2016-12-08.17:14:10 |
|
Regarding Andrew's suggestion, this is indeed another optimization we discussed
and should test at some point. It's already part of the longer-term plan (see
the SearchNodeInfo row and "ideal" column in msg5843).
I generally like doing this one optimization at a time because I think that
makes it easier to evaluate the impact of the various changes on memory,
coverage and runtime, but we should certainly not forget about this optimization
opportunity.
Let me point out that for certain search algorithms (greedy BFS without
reopening), we can also get rid of the g values altogether.
|
msg5852 (view) |
Author: malte |
Date: 2016-12-08.17:10:25 |
|
Hi Andrew, you're not the first (or second, or third ;-)) one to be tripped up
by the new "optional summary" field. Pasting your comment as a change note:
=========
Jendrik - re the changes for v1 - is there a need to store the parent operator
at all? A while back I stripped this out of a fork of FD, and when a goal state
was reached, derived what the operators must have been by looking at the
differences between states on the path to the goal. This has a small
postprocessing cost, but saves memory.
|
msg5851 (view) |
Author: malte |
Date: 2016-12-08.15:02:34 |
|
So far, so good! It would be nice to reduce the runtime penalty, but if it turns
out that we cannot, I don't think it would be a deal-breaker. Hopefully the
overhead will be much less noticeable with non-blind search settings.
I have two issues with the experiment:
1) If I understand correctly, the "base" numbers come from a separate
experiment. For the same reason that we randomize run IDs in our experiments, we
can get systematic errors if we combine data from two separate experiments. This
is less of an issue here than usual because these errors only affect runtime,
not memory usage. But I think we should still run a "clean" (all-in-one)
experiment before merging.
2) For a complete picture, we should have 32-bit and 64-bit data for all
configurations. That is, we should have data for v1-m32 and v2-m64, too.
|
msg5850 (view) |
Author: jendrik |
Date: 2016-12-08.14:53:00 |
|
Two changes for reducing memory consumption have been made in the pull request so far:
issue213-v1 uses ints instead of GlobalOperator pointers for storing the parent operator.
issue213-v2 (in addition to the changes in v1) uses IntHashSet instead of std::unordered_set for storing the closed list.
I have run an experiment comparing v1 and v2 for blind search in a 64-bit build:
http://ai.cs.unibas.ch/_tmp_files/seipp/issue213-v2-blind-m64-issue213-v1-issue213-v2-compare.html
Memory consumption is significantly reduced, but runtimes go up.
v2 recovers the loss in total coverage for blind search incurred by switching to 64-bit builds:
version coverage
------------------
base-m32: 617
base-m64: 602
v1-m64: 606
v2-m64: 624
|
msg5849 (view) |
Author: jendrik |
Date: 2016-12-05.11:13:42 |
|
Concerning the CPLEX error: we decided to postpone fixing this until we find a
reliable way to test the fix and opened issue691 for this.
|
msg5847 (view) |
Author: florian |
Date: 2016-12-05.10:51:20 |
|
> We could exit with EXIT_OUT_OF_MEMORY if the error message matches "CPXlpopt
> returned error 1001" (1001 stands for out of memory in the CPLEX lib). Not
> sure if that's the best solution though.
Sounds good. This is also the way we handle other error messages that come
through unexpected channels (e.g., OSI reports some CPLEX errors as warnings).
|
msg5846 (view) |
Author: jendrik |
Date: 2016-12-05.01:01:48 |
|
Minor remark: the task tested in msg5843 is actually elevators-opt11-strips:p04.pddl. In
revision issue213-base it uses 102 MB in a 32-bit build and 152 MB in a 64-bit build.
I have implemented a new hash table class using open addressing instead of chaining. It
decreases peak memory usage to 80 MB in 32-bit builds and to 82 MB in 64-bit builds.
A critical ingredient was a new hash function. The old hash function
(utils/hash.h:hash_sequence()) produced too many collisions, resulting in a final load
factor of 0.05 and using ~400 MB (peak memory). Switching to Jenkins' one-at-a-time hash
function (https://en.wikipedia.org/wiki/Jenkins_hash_function) increased the final load
factor to 0.79.
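For reference, Jenkins' one-at-a-time scheme looks like this in Python (a sketch only: it is applied here per integer rather than per byte, and the planner's C++ version may differ in such details):

```python
MASK32 = 0xFFFFFFFF  # emulate 32-bit unsigned overflow


def one_at_a_time(values):
    """Jenkins' one-at-a-time hash over a sequence of integers."""
    h = 0
    for value in values:
        h = (h + (value & MASK32)) & MASK32
        h = (h + ((h << 10) & MASK32)) & MASK32
        h ^= h >> 6
    # Final avalanche so that all input bits affect the result.
    h = (h + ((h << 3) & MASK32)) & MASK32
    h ^= h >> 11
    h = (h + ((h << 15) & MASK32)) & MASK32
    return h
```

The point of the per-element mixing (add, shift, xor) is to spread similar state vectors across buckets, which is what raised the load factor here.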
Since coming up with his one-at-a-time hash function, Jenkins has released multiple
improved hash functions into the public domain. I will try hashword() from
http://www.burtleburtle.net/bob/c/lookup3.c tomorrow. This seems to be one of the best
non-cryptographic hash functions for variable-length integer sequences. He has also
released SpookyHash, but it only works for 64-bit builds.
It would be great if someone could have a look at the implementation of IntHashMap. I
opened a pull request at https://bitbucket.org/jendrikseipp/downward/pull-requests/63 .
|
msg5844 (view) |
Author: malte |
Date: 2016-11-30.19:16:01 |
|
One small addition: the total numbers in the last row of Jendrik's table assume
that the number of open list entries and the number of generated states are of
the same order of magnitude. For tasks with many duplicates, it is possible that
there are arbitrarily more open list entries, so that the open lists can become
the dominating factor for memory usage.
If we pursue the goal of minimizing memory usage, it would make sense to
introduce measurements to the planner that compare the maximum number of open
list entries with the number of generated states. If it turns out that open
lists are sometimes the bottleneck, we can rethink our current (quite lazy)
duplicate elimination scheme.
|
msg5843 (view) |
Author: jendrik |
Date: 2016-11-30.19:10:59 |
|
Malte and I analyzed the memory profiles of A* + blind search for pegsol-opt11-strips:p03.pddl today and noted
where we use how much memory in a 32-bit, 64-bit and ideal version. Here are the numbers:
A* + blind search 32-bit 64-bit ideal
---------------------------------------------------------------------------------------------------------------
PerStateInformation<SearchNodeInfo> 16 Bytes 24 Bytes 8 Bytes (parent state + g)
(parent state, creating operator, g, real g)
PerStateInformation<HEntry> 4 Bytes 4 Bytes 0 Bytes (for blind search)
(h-cache)
State registry - state data 4*B Bytes 4*B Bytes 4*B Bytes (B = #buckets)
(h-cache)
State registry - hash table ~17 Bytes ~34 Bytes ~5 Bytes (4 Bytes + overhead)
(h-cache)
Open list 4 Bytes 4 Bytes 4 Bytes (per open-list entry)
---------------------------------------------------------------------------------------------------------------
Total ~45 Bytes ~70 Bytes ~21 Bytes (in example with B=1)
We have already started to reduce the size of SearchNodeInfo by storing operator IDs (ints) instead of
GlobalOperator pointers. Next, we will be looking into more memory-efficient hash table implementations, probably
using open addressing instead of separate chaining like std::unordered_set.
|
msg5842 (view) |
Author: jendrik |
Date: 2016-11-30.00:28:34 |
|
I agree. After looking at the code again, my fix makes no sense. The error
message apparently comes from COIN, not CPLEX. Our code detects this, calls
handle_coin_error() which prints "Coin threw exception: ...". We could exit
with EXIT_OUT_OF_MEMORY if the error message matches "CPXlpopt returned error
1001" (1001 stands for out of memory in the CPLEX lib). Not sure if that's the
best solution though.
|
msg5841 (view) |
Author: florian |
Date: 2016-11-29.21:34:00 |
|
Hard to say. The other error messages look more like the warning you saw on
stdout. So, maybe this is communicated over a different channel (wouldn't
surprise me). Is the "Coin threw exception:" stuff from our code? It would be
better if we were sure that the error actually passes through the error handler
that you modified. If you cannot reproduce the error, I would rather leave the
handler as it is than change it and hope that the change is correct.
|
msg5840 (view) |
Author: jendrik |
Date: 2016-11-29.21:18:41 |
|
I failed to reproduce the error on both my machine and maia. So I'm not sure how to test my fix. I pushed it to my
bitbucket repo:
https://bitbucket.org/jendrikseipp/downward/commits/d19cd04fa4bb1547524573669ac83170c050cd5b?at=default
Florian, do you think this fix is correct?
|
msg5838 (view) |
Author: florian |
Date: 2016-11-29.14:45:21 |
|
The message "Compressing row and column files." is unrelated (that is just
CPLEX's last effort to conserve memory). We would have to catch error 1001 in
our custom error handler before it throws the exception.
|
msg5837 (view) |
Author: jendrik |
Date: 2016-11-29.14:38:37 |
|
Concerning the critical error:
The output on stdout is
CPX0000 Compressing row and column files.
The output on stderr is
Coin threw exception: CPXlpopt returned error 1001
from method resolve
from class OsiCpxSolverInterface
Unexplained error occurred.
(The last line is from Fast Downward.)
Interestingly, the file "OsiDefaultName.xxx" is written to disk, containing a
textual representation of an LP.
|
msg5835 (view) |
Author: malte |
Date: 2016-11-29.14:14:50 |
|
It's a pity that the data isn't better. It's hard to justify moving to 64 bits
with such a large increase in memory usage (and loss in coverage). I think with
these numbers, we should do something about memory usage.
I suggest we focus on blind search first. There may be other memory-wasters
hidden inside certain heuristics etc., but whatever is making blind search use
much more memory will affect all configurations that expand/evaluate/generate
many states.
Of course we should measure things before proceeding, but my best guess at the
moment is that the hash table (unordered_set) inside StateRegistry is to blame
since I cannot think of any other major data structures for which it is
plausible that they require much more memory in a 64-bit compile.
I would suggest that we
1) do a memory profile on some representative blind search test cases, in 32-bit
mode and 64-bit mode
2) look more closely at the main memory users and their underlying
implementation to see how they differ in 32-bit and 64-bit mode
3) think of more memory-efficient implementations
Perhaps the answer is simply to use another ready-made hash table
implementation, but I think it may be worthwhile to understand this more deeply
before we design a solution.
|
msg5834 (view) |
Author: florian |
Date: 2016-11-29.14:03:22 |
|
Looks like a huge satellite task running out of memory in an LP configuration. I
suspect CPLEX throws an error that we don't recognize as an out-of-memory
error. This happened before, and we already have special casing for some of
these errors. Jendrik, if you can figure out the exact error message from the logs,
we can add a special case for it.
|
msg5831 (view) |
Author: malte |
Date: 2016-11-29.13:57:52 |
|
Should we look into the critical error in that experiment?
|
msg5828 (view) |
Author: jendrik |
Date: 2016-11-29.11:54:22 |
|
Due to newer CPLEX versions removing support for 32-bit builds, the discussion
about 64-bit builds came up again. I repeated Florian's experiment on the bigger
benchmark set and with some new optimal configurations. The general picture
remains the same though. Coverage decreases in many domains for all configs.
http://ai.cs.unibas.ch/_tmp_files/seipp/issue213-opt-comparison.html
The question is whether and how we want to act on these results. This may be
something we should discuss in a Fast Downward meeting.
|
msg4128 (view) |
Author: jendrik |
Date: 2015-03-31.13:38:22 |
|
issue436 (removing GNU hash sets and maps) has been addressed now, so work on
this issue could continue.
|
msg3882 (view) |
Author: andrew.coles |
Date: 2014-10-27.14:01:11 |
|
(Background:
The standard hash_set implementation is a hashtable. The hashtable is stored as
a std::vector, where each index contains a singly-linked list: an item of data,
and a 'next' pointer to the next item in the list. )
It's also possible to represent the lists in each bucket of a hash set by moving
the next pointer into the state. Normally this wouldn't be done -- we can only
have one hash set implemented in this way -- but the state registry is special.
In implementation terms, this can be done by making the PackedStateBin one int
larger, and instead of storing the 'next' entry as a pointer, store it as the
StateID of the next state. The hash bucket themselves are then StateIDs: -1 if
empty, or the StateID of the first state on the list of states in that bucket.
Storing 'next' as part of the PackedStateBin gives a 25% reduction in memory
usage on 32-bit builds - see attached state_registry_low_memory.tar.gz. The
downside is a drop in performance: blind search is a factor of 2 to 3 slower.
(This surprised me, and might be improvable with some profiling.) The % time
overhead would of course be lower with something other than blind search.
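The scheme Andrew describes can be sketched as a toy registry in Python (names invented for this sketch; the attached tarball contains the real C++ version): each bucket is just a StateID, -1 means empty, and each state's 'next' link is stored alongside its data instead of in list nodes owned by the hash table:

```python
class IntrusiveStateSet:
    """Hash set whose chain links live next to the state data."""

    EMPTY = -1  # bucket value for "no state in this bucket"

    def __init__(self, num_buckets=1024):
        self.buckets = [self.EMPTY] * num_buckets
        self.states = []     # packed state data (here: plain tuples)
        self.next_ids = []   # per-state link to the next state in its bucket

    def insert(self, state):
        """Return the StateID of 'state', registering it if new."""
        bucket = hash(state) % len(self.buckets)
        state_id = self.buckets[bucket]
        while state_id != self.EMPTY:  # walk the chain of StateIDs
            if self.states[state_id] == state:
                return state_id
            state_id = self.next_ids[state_id]
        new_id = len(self.states)
        self.states.append(state)
        self.next_ids.append(self.buckets[bucket])  # prepend to the chain
        self.buckets[bucket] = new_id
        return new_id
```

The saving is that the hash table needs no separately allocated list nodes: the 'next' field corresponds to the extra int added to PackedStateBin, which is where the ~25% memory reduction on 32-bit builds comes from.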
|
msg3881 (view) |
Author: florian |
Date: 2014-10-27.11:20:12 |
|
Jendrik already added them (hash sets etc. are handled in issue436).
|
msg3880 (view) |
Author: malte |
Date: 2014-10-27.11:18:00 |
|
Worth adding crossreferences between the issues?
|
msg3879 (view) |
Author: florian |
Date: 2014-10-27.11:17:39 |
|
Ahh, interesting. This looks very similar to the way we store state data in the
state registry. Maybe we can reuse some code here. But I think we should wait
until we changed the hash sets to the c++11 variants.
|
msg3878 (view) |
Author: andrew.coles |
Date: 2014-10-27.00:45:25 |
|
I've been looking into memory management recently*, and various hash set/hash
table options, and recalled there was a thread here on that topic.
You might want to try the attached replacement for the standard C++ allocator,
for use where objects will be persistent - e.g. nodes inside data structures.
In state_registry.h switch the definition of StateIDSet to:
typedef __gnu_cxx::hash_set<StateID,
StateIDSemanticHash,
StateIDSemanticEqual,
ChunkAllocator<StateID> > StateIDSet;
A quick unscientific experiment doing this on a 32-bit build shows a 17% or so
reduction in memory usage.
* In IPC2011, POPF used 6GB of memory in 25s, so this is long overdue
|
msg3730 (view) |
Author: jendrik |
Date: 2014-10-08.21:30:42 |
|
I added a note in issue436.
|
msg3727 (view) |
Author: florian |
Date: 2014-10-08.20:37:02 |
|
I ran some memory profiles a while back and if I remember correctly, the hash
set entries contributed a large part to the total memory usage, so this seems
plausible. I vote to leave this issue open, add a reference to it on the issue
for hash sets (does it exist yet?) and do a memory profile once we finished
working on the hash sets.
|
msg3714 (view) |
Author: malte |
Date: 2014-10-08.12:10:23 |
|
Interesting and mildly surprising. The only reasonable explanation I can come up with
for something like astar(blind()) is that this is due to the central hash_set in
the state registry. We might want to leave this one open and look at it again
when we revisit the hash tables, which is something we had on the screen anyway.
Or we might just close it as something that is the way it is. Preferences?
|
msg3702 (view) |
Author: florian |
Date: 2014-10-08.09:04:47 |
|
Seems like this is still an issue with the new state representation:
http://ai.cs.unibas.ch/_tmp_files/pommeren/issue213-base-issue213-base64-compare.html
|
msg3669 (view) |
Author: malte |
Date: 2014-10-06.12:20:19 |
|
I suggest blind(), BJOLP, lmcut(), ipdb() and our two common merge-and-shrink
configurations, on the optimal (incl. ipc11) benchmark suite.
|
msg3668 (view) |
Author: florian |
Date: 2014-10-06.12:16:55 |
|
> Florian, how much work would it be to investigate the time and memory of -m32
> vs. -m64 for some standard configurations?
Not too much. I could create two experimental branches, modify the Makefile in
them and set up an experiment for this. Which configs/tasks do you want to test?
|
msg3636 (view) |
Author: malte |
Date: 2014-10-04.19:42:03 |
|
I'm adding Florian to this issue as our "master state memory manager". I wonder
if this is still an issue with our new way of storing state information. Perhaps
we can get rid of "-m32"?
Florian, how much work would it be to investigate the time and memory of -m32
vs. -m64 for some standard configurations?
|
msg1608 (view) |
Author: malte |
Date: 2011-08-13.20:17:45 |
|
I made some more tests, which show that the behaviour already shows up with
blind search. Here's the relevant data; everything else (no. expanded states,
no. hash buckets in the closed list etc.) was reported identically with -m32 and
-m64.
For reference, this is from my new desktop machine (Xeon E31270 running 64-bit
Ubuntu 10.10 with gcc 4.4.4-14ubuntu5).
./downward-1 --search 'astar(blind())' < output:
woodworking-opt #12:
-m32:
Total time: 20.28s
Peak memory: 345372 KB
-m64:
Total time: 18.25s
Peak memory: 501048 KB
woodworking-opt #23:
-m32:
Total time: 11.67s
Peak memory: 214460 KB
-m64:
Total time: 10.69s
Peak memory: 309812 KB
blocks #9-0:
-m32:
Total time: 24.07s
Peak memory: 497516 KB
-m64:
Total time: 22.61s
Peak memory: 866388 KB
|
msg1607 (view) |
Author: malte |
Date: 2011-08-13.19:12:29 |
|
For future reference, here's the relevant excerpt of the original email that
discussed -m32 vs. -m64 memory usage:
===========================================================================
as a sanity test for our packaging, I did some experiments with the
final seq-opt-bjolp package this night, and noticed that it lost 17
(IIRC) problems compared to my previous tests, which is rather a lot.
Looking into this a bit more deeply, this seems to be due to -m64 using
a lot more memory than -m32. My tests were run with 2 GB (so this would
not be such an issue with the 6 GB in the competition), but still the
memory usage difference is rather large, and maybe we should do
something about this in the long run. I don't know if this is typical
behaviour or specific to LM configurations.
Example for the four toughest solved tasks in woodworking:
#04: uses 737 MB with -m32, 1192 MB with -m64
#14: uses 1277 MB with -m32, runs out of memory with -m64
#15: uses 1583 MB with -m32, runs out of memory with -m64
#24: uses 322 MB with -m32, 514 MB with -m64
I think there's nothing we could or should do about this for the
competition, but maybe it's worth opening an issue on this for later.
===========================================================================
|
msg1220 (view) |
Author: malte |
Date: 2011-01-23.11:29:53 |
|
It's true that the LM graph storage is not particularly efficient, but that
cannot explain memory usage in the gigabytes. In fact, I'd be surprised if the
landmark graph contributed even a single megabyte here. We should really
optimize where it matters first, and that means reducing *per-state* memory
cost, not per-problem memory cost. That means we can safely ignore 1) and 2),
although 3) is indeed an issue.
Also, going from vector<int> to vector<state_t> etc. on vectors that only
usually have one or two elements as in point 1) saves next to nothing (or even
exactly nothing due to memory alignment issues); the vector and dynamic-memory
management overhead dwarfs everything else here. Feel free to try the
optimizations in 1) and 2) out, since seeing is believing :-).
Note that so far we have no evidence that this memory explosion only happens for
the landmark configurations; they might happen everywhere. Landmark
configurations were the only ones I had the chance to look at so far.
|
msg1219 (view) |
Author: erez |
Date: 2011-01-23.11:11:44 |
|
The landmarks code is not very efficient with memory, and this is compounded in
64-bit mode. Examples with BJOLP configuration that Malte ran:
Woodworking #04: uses 737 MB with -m32, 1192 MB with -m64
Woodworking #14: uses 1277 MB with -m32, runs out of memory with -m64
Woodworking #15: uses 1583 MB with -m32, runs out of memory with -m64
Woodworking #24: uses 322 MB with -m32, 514 MB with -m64
Here are some places where the landmarks code uses lots of memory:
1. LandmarkNode has:
vector<int> vars
vector<int> vals
set<int> first_achievers
set<int> possible_achievers
hash_set<pair<int, int>, hash_int_pair> forward_orders
Should be -
vector<var_num_t> vars
vector<state_var_t> vals
set<operator_num_t> first_achievers
set<operator_num_t> possible_achievers
hash_set<pair<var_num_t, state_var_t>, hash_int_pair> forward_orders
If var_num_t, operator_num_t and state_var_t are all 32 bits, that would reduce
the memory footprint of the landmarks by almost a half (there are some other
members).
2. LandmarksGraph has:
vector<int> empty_pre_operators
vector<vector<vector<int> > > operators_eff_lookup
vector<vector<vector<int> > > operators_pre_lookup
vector<vector<set<pair<int, int> > > > inconsistent_facts
hash_map<pair<int, int>, LandmarkNode *, hash_int_pair> simple_lms_to_nodes
hash_map<pair<int, int>, LandmarkNode *, hash_int_pair> disj_lms_to_nodes
hash_map<pair<int, int>, Pddl_proposition, hash_int_pair> pddl_propositions
should be -
vector<operator_num_t> empty_pre_operators
vector<vector<vector<operator_num_t> > > operators_eff_lookup
vector<vector<vector<operator_num_t> > > operators_pre_lookup
vector<vector<set<pair<num_vars_t, state_var_t> > > > inconsistent_facts
hash_map<pair<num_vars_t, state_var_t>, LandmarkNode *, hash_int_pair>
simple_lms_to_nodes
hash_map<pair<num_vars_t, state_var_t>, LandmarkNode *, hash_int_pair>
disj_lms_to_nodes
hash_map<pair<num_vars_t, state_var_t>, Pddl_proposition, hash_int_pair>
pddl_propositions
3. LandmarkStatusManager uses StateProxy. If we add an ID for each state, we can
replace this with a number, which might use less memory.
|
|
Date |
User |
Action |
Args |
2019-01-21 13:58:32 | malte | set | messages:
+ msg8494 |
2019-01-21 13:56:24 | jendrik | set | status: chatting -> resolved messages:
+ msg8493 summary: TODOs before we can close this issue:
- Use operator IDs instead of generating operator pointers (issue692)
- Implement new hash function (issue693)
- Implement new hash table (issue694)
- Store landmarks more efficiently (issue695)
- Test that most important configurations perform as well in 64-bit builds as
in 32-bit builds with 5 and 30 minute timeouts (this issue): done
- Make relaxation heuristics more efficient in 64-bit mode (issue814)
- Remove option for 32-bit builds from the build system (issue754)
- Run 5 minute experiments in debug mode: done
- Update wiki docs: done
TODOs for later:
- Reduce memory usage of causal graph heuristic cache (issue838)
- Improve performance of cea (use fewer pointers?)
- Improve performance of cg? (perhaps performance is already OK, but similar
ideas as for h^add and/or h^cea should be directly applicable)
- Improve performance of LM-Cut (similar ideas as for h^add etc. could be applied) -> TODOs before we can close this issue:
- Use operator IDs instead of generating operator pointers (issue692)
- Implement new hash function (issue693)
- Implement new hash table (issue694)
- Store landmarks more efficiently (issue695)
- Test that most important configurations perform as well in 64-bit builds as
in 32-bit builds with 5 and 30 minute timeouts (this issue): done
- Make relaxation heuristics more efficient in 64-bit mode (issue814)
- Remove option for 32-bit builds from the build system (issue754)
- Run 5 minute experiments in debug mode: done
- Update wiki docs: done |
2019-01-21 12:40:05 | jendrik | set | messages:
+ msg8488 |
2019-01-21 12:33:27 | malte | set | status: resolved -> chatting messages:
+ msg8486 |
2019-01-18 17:10:31 | jendrik | set | status: in-progress -> resolved messages:
+ msg8473 summary: TODOs before we can close this issue:
- Use operator IDs instead of generating operator pointers (issue692)
- Implement new hash function (issue693)
- Implement new hash table (issue694) (blocked by issue693)
- Store landmarks more efficiently (issue695)
- Test that most important configurations perform as well in 64-bit builds as
in 32-bit builds with 5 and 30 minute timeouts (this issue)
- Make relaxation heuristics more efficient in 64-bit mode (issue814)
- Remove option for 32-bit builds from the build system (issue754)
- Run 5 minute experiments in debug mode
TODOs for later:
- Reduce memory usage of causal graph heuristic cache (issue838)
- Improve performance of cea (use fewer pointers?)
- Improve performance of cg? (perhaps performance is already OK, but similar
ideas as for h^add and/or h^cea should be directly applicable)
- Improve performance of LM-Cut (similar ideas as for h^add etc. could be applied) -> TODOs before we can close this issue:
- Use operator IDs instead of generating operator pointers (issue692)
- Implement new hash function (issue693)
- Implement new hash table (issue694)
- Store landmarks more efficiently (issue695)
- Test that most important configurations perform as well in 64-bit builds as
in 32-bit builds with 5 and 30 minute timeouts (this issue): done
- Make relaxation heuristics more efficient in 64-bit mode (issue814)
- Remove option for 32-bit builds from the build system (issue754)
- Run 5 minute experiments in debug mode: done
- Update wiki docs: done
TODOs for later:
- Reduce memory usage of causal graph heuristic cache (issue838)
- Improve performance of cea (use fewer pointers?)
- Improve performance of cg? (perhaps performance is already OK, but similar
ideas as for h^add and/or h^cea should be directly applicable)
- Improve performance of LM-Cut (similar ideas as for h^add etc. could be applied) |
2018-12-15 17:33:39 | malte | set | messages:
+ msg8364 |
2018-12-13 23:04:46 | jendrik | set | messages:
+ msg8310 |
2018-12-13 20:56:17 | jendrik | set | messages:
+ msg8307 |
2018-12-13 20:27:32 | malte | set | messages:
+ msg8306 |
2018-12-13 20:24:37 | jendrik | set | messages:
+ msg8304 |
2018-12-13 19:40:02 | malte | set | messages:
+ msg8302 summary: TODOs before we can close this issue:
- Use operator IDs instead of generating operator pointers (issue692)
- Implement new hash function (issue693)
- Implement new hash table (issue694) (blocked by issue693)
- Store landmarks more efficiently (issue695)
- Test that most important configurations perform as well in 64-bit builds as
in 32-bit builds with 5 and 30 minute timeouts (this issue)
- Make relaxation heuristics more efficient in 64-bit mode (issue814)
- Remove option for 32-bit builds from the build system (issue754)
- Run 5 minute experiments in debug mode
TODOs for later:
- Reduce memory usage of causal graph heuristic cache (issue838)
- Improve performance of cea (use fewer pointers?) -> TODOs before we can close this issue:
- Use operator IDs instead of generating operator pointers (issue692)
- Implement new hash function (issue693)
- Implement new hash table (issue694) (blocked by issue693)
- Store landmarks more efficiently (issue695)
- Test that most important configurations perform as well in 64-bit builds as
in 32-bit builds with 5 and 30 minute timeouts (this issue)
- Make relaxation heuristics more efficient in 64-bit mode (issue814)
- Remove option for 32-bit builds from the build system (issue754)
- Run 5 minute experiments in debug mode
TODOs for later:
- Reduce memory usage of causal graph heuristic cache (issue838)
- Improve performance of cea (use fewer pointers?)
- Improve performance of cg? (perhaps performance is already OK, but similar
ideas as for h^add and/or h^cea should be directly applicable)
- Improve performance of LM-Cut (similar ideas as for h^add etc. could be applied) |
2018-12-13 19:38:33 | malte | set | messages:
+ msg8301 |
2018-12-13 13:19:17 | florian | set | messages:
+ msg8285 |
2018-12-13 12:57:39 | jendrik | set | messages:
+ msg8284 |
2018-12-13 12:38:22 | jendrik | set | messages:
+ msg8283 summary: TODOs before we can close this issue:
- Use operator IDs instead of generating operator pointers (issue692)
- Implement new hash function (issue693)
- Implement new hash table (issue694) (blocked by issue693)
- Store landmarks more efficiently (issue695)
- Test that most important configurations perform as well in 64-bit builds as
in 32-bit builds with 5 and 30 minute timeouts (this issue)
- Make relaxation heuristics more efficient in 64-bit mode (issue814)
- Remove option for 32-bit builds from the build system (issue754)
- Run 5 minute experiments in debug mode
- Investigate drop in performance of the operator-counting heuristics (msg7584)
TODOs for later:
- Reduce memory usage of causal graph heuristic cache (issue838)
- Improve performance of cea (use fewer pointers?) -> TODOs before we can close this issue:
- Use operator IDs instead of generating operator pointers (issue692)
- Implement new hash function (issue693)
- Implement new hash table (issue694) (blocked by issue693)
- Store landmarks more efficiently (issue695)
- Test that most important configurations perform as well in 64-bit builds as
in 32-bit builds with 5 and 30 minute timeouts (this issue)
- Make relaxation heuristics more efficient in 64-bit mode (issue814)
- Remove option for 32-bit builds from the build system (issue754)
- Run 5 minute experiments in debug mode
TODOs for later:
- Reduce memory usage of causal graph heuristic cache (issue838)
- Improve performance of cea (use fewer pointers?) |
2018-12-06 15:38:50 | jendrik | set | messages:
+ msg8166 summary: Subissues:
- Use operator IDs instead of generating operator pointers (issue692)
- Implement new hash function (issue693)
- Implement new hash table (issue694) (blocked by issue693)
- Store landmarks more efficiently (issue695)
- Test that most important configurations perform as well in 64-bit builds as
in 32-bit builds (this issue)
- Make lazy greedy search more efficient in 64-bit mode (issue814)
- Remove option for 32-bit builds from the build system (issue754)
- Before closing this, run 5 minute experiments in debug mode
- Investigate drop in performance of the operator-counting heuristics (msg7584)
- Reduce memory usage of causal graph heuristic cache (issue838)
- Improve performance of cea (use fewer pointers?) -> TODOs before we can close this issue:
- Use operator IDs instead of generating operator pointers (issue692)
- Implement new hash function (issue693)
- Implement new hash table (issue694) (blocked by issue693)
- Store landmarks more efficiently (issue695)
- Test that most important configurations perform as well in 64-bit builds as
in 32-bit builds with 5 and 30 minute timeouts (this issue)
- Make relaxation heuristics more efficient in 64-bit mode (issue814)
- Remove option for 32-bit builds from the build system (issue754)
- Run 5 minute experiments in debug mode
- Investigate drop in performance of the operator-counting heuristics (msg7584)
TODOs for later:
- Reduce memory usage of causal graph heuristic cache (issue838)
- Improve performance of cea (use fewer pointers?) |
2018-12-06 15:28:55 | jendrik | set | nosy:
+ cedric messages:
+ msg8165 summary: Subissues:
- Use operator IDs instead of generating operator pointers (issue692)
- Implement new hash function (issue693)
- Implement new hash table (issue694) (blocked by issue693)
- Store landmarks more efficiently (issue695)
- Test that most important configurations perform as well in 64-bit builds as
in 32-bit builds (this issue)
- Make lazy greedy search more efficient in 64-bit mode (issue814)
- Remove option for 32-bit builds from the build system (issue754)
- Before closing this, run 5 minute experiments in debug mode
- Investigate drop in performance of the operator-counting heuristics (msg7584)
- Reduce memory usage of causal graph heuristic cache (issue838) -> Subissues:
- Use operator IDs instead of generating operator pointers (issue692)
- Implement new hash function (issue693)
- Implement new hash table (issue694) (blocked by issue693)
- Store landmarks more efficiently (issue695)
- Test that most important configurations perform as well in 64-bit builds as
in 32-bit builds (this issue)
- Make lazy greedy search more efficient in 64-bit mode (issue814)
- Remove option for 32-bit builds from the build system (issue754)
- Before closing this, run 5 minute experiments in debug mode
- Investigate drop in performance of the operator-counting heuristics (msg7584)
- Reduce memory usage of causal graph heuristic cache (issue838)
- Improve performance of cea (use fewer pointers?) |
2018-09-20 15:57:22 | malte | set | messages:
+ msg7650 |
2018-09-20 15:21:02 | jendrik | set | messages:
+ msg7647 summary: Subissues:
- Use operator IDs instead of generating operator pointers (issue692)
- Implement new hash function (issue693)
- Implement new hash table (issue694) (blocked by issue693)
- Store landmarks more efficiently (issue695)
- Test that most important configurations perform as well in 64-bit builds as
in 32-bit builds (this issue)
- Make lazy greedy search more efficient in 64-bit mode (issue814)
- Remove option for 32-bit builds from the build system (issue754)
- Before closing this, run 5 minute experiments in debug mode.
- Investigate drop in performance of the operator-counting heuristics (msg7584) -> Subissues:
- Use operator IDs instead of generating operator pointers (issue692)
- Implement new hash function (issue693)
- Implement new hash table (issue694) (blocked by issue693)
- Store landmarks more efficiently (issue695)
- Test that most important configurations perform as well in 64-bit builds as
in 32-bit builds (this issue)
- Make lazy greedy search more efficient in 64-bit mode (issue814)
- Remove option for 32-bit builds from the build system (issue754)
- Before closing this, run 5 minute experiments in debug mode
- Investigate drop in performance of the operator-counting heuristics (msg7584)
- Reduce memory usage of causal graph heuristic cache (issue838) |
2018-09-20 14:51:42 | jendrik | set | messages:
+ msg7645 |
2018-09-20 14:36:47 | jendrik | set | messages:
+ msg7642 |
2018-09-20 13:18:15 | jendrik | set | messages:
+ msg7635 |
2018-09-20 12:42:15 | malte | set | messages:
+ msg7632 |
2018-09-20 12:08:45 | jendrik | set | messages:
+ msg7630 summary: Subissues:
- Use operator IDs instead of generating operator pointers (issue692)
- Implement new hash function (issue693)
- Implement new hash table (issue694) (blocked by issue693)
- Store landmarks more efficiently (issue695)
- Test that most important configurations perform as well in 64-bit builds as
in 32-bit builds (this issue)
- Make lazy greedy search more efficient in 64-bit mode (issue814)
- Remove option for 32-bit builds from the build system (issue754)
- Before closing this, run 5 minute experiments in debug mode. -> Subissues:
- Use operator IDs instead of generating operator pointers (issue692)
- Implement new hash function (issue693)
- Implement new hash table (issue694) (blocked by issue693)
- Store landmarks more efficiently (issue695)
- Test that most important configurations perform as well in 64-bit builds as
in 32-bit builds (this issue)
- Make lazy greedy search more efficient in 64-bit mode (issue814)
- Remove option for 32-bit builds from the build system (issue754)
- Before closing this, run 5 minute experiments in debug mode.
- Investigate drop in performance of the operator-counting heuristics (msg7584) |
2018-09-20 12:06:36 | jendrik | set | messages:
+ msg7629 |
2018-09-19 16:17:16 | florian | set | messages:
+ msg7584 |
2018-09-19 12:28:35 | jendrik | set | messages:
+ msg7561 |
2018-09-18 14:19:20 | malte | set | messages:
+ msg7514 |
2018-09-18 13:59:26 | jendrik | set | messages:
+ msg7512 |
2018-09-17 15:40:47 | jendrik | set | messages:
+ msg7491 summary: Subissues:
- Use operator IDs instead of generating operator pointers (issue692)
- Implement new hash function (issue693)
- Implement new hash table (issue694) (blocked by issue693)
- Store landmarks more efficiently (issue695)
- Test that most important configurations perform as well in 64-bit builds as
in 32-bit builds (this issue)
- Make lazy greedy search more efficient in 64-bit mode (issue814)
- Remove option for 32-bit builds from the build system (issue754) -> Subissues:
- Use operator IDs instead of generating operator pointers (issue692)
- Implement new hash function (issue693)
- Implement new hash table (issue694) (blocked by issue693)
- Store landmarks more efficiently (issue695)
- Test that most important configurations perform as well in 64-bit builds as
in 32-bit builds (this issue)
- Make lazy greedy search more efficient in 64-bit mode (issue814)
- Remove option for 32-bit builds from the build system (issue754)
- Before closing this, run 5 minute experiments in debug mode. |
2018-09-17 08:45:11 | silvan | set | nosy:
+ silvan |
2018-09-15 15:32:31 | malte | set | messages:
+ msg7470 |
2018-09-15 09:42:08 | jendrik | set | messages:
+ msg7469 |
2018-09-14 11:17:39 | malte | set | messages:
+ msg7455 |
2018-09-14 11:01:48 | jendrik | set | messages:
+ msg7454 |
2018-09-14 10:26:50 | malte | set | messages:
+ msg7453 |
2018-09-14 09:17:58 | jendrik | set | messages:
+ msg7449 |
2018-09-13 14:14:56 | jendrik | set | messages:
+ msg7432 |
2018-09-13 12:30:49 | malte | set | messages:
+ msg7430 |
2018-09-13 12:14:06 | jendrik | set | messages:
+ msg7427 |
2018-09-13 11:58:32 | malte | set | messages:
+ msg7425 title: inefficient memory use with -m64 setting -> Remove 32-bit and 64-bit build options. |
2018-09-13 11:45:15 | jendrik | set | status: chatting -> in-progress messages:
+ msg7422 summary: Subissues:
- Use operator IDs instead of generating operator pointers (issue692)
- Implement new hash function (issue693)
- Implement new hash table (issue694) (blocked by issue693)
- Store landmarks more efficiently (issue695)
- Test that most important configurations perform as well in 64-bit builds as
in
32-bit builds (this issue)
- Make lazy greedy search more efficient in 64-bit mode (issue814)
- Remove option for 32-bit builds from the build system (issue754) -> Subissues:
- Use operator IDs instead of generating operator pointers (issue692)
- Implement new hash function (issue693)
- Implement new hash table (issue694) (blocked by issue693)
- Store landmarks more efficiently (issue695)
- Test that most important configurations perform as well in 64-bit builds as
in 32-bit builds (this issue)
- Make lazy greedy search more efficient in 64-bit mode (issue814)
- Remove option for 32-bit builds from the build system (issue754) |
2018-09-12 14:36:34 | malte | set | messages:
+ msg7399 summary: Subissues:
- Use operator IDs instead of generating operator pointers (issue692)
- Implement new hash function (issue693)
- Implement new hash table (issue694) (blocked by issue693)
- Store landmarks more efficiently (issue695)
- Test that most important configurations perform as well in 64-bit builds as
in
32-bit builds (this issue)
- Make lazy greedy search more memory-efficient in 64-bit mode (issue814)
- Remove option for 32-bit builds from the build system (issue754) -> Subissues:
- Use operator IDs instead of generating operator pointers (issue692)
- Implement new hash function (issue693)
- Implement new hash table (issue694) (blocked by issue693)
- Store landmarks more efficiently (issue695)
- Test that most important configurations perform as well in 64-bit builds as
in
32-bit builds (this issue)
- Make lazy greedy search more efficient in 64-bit mode (issue814)
- Remove option for 32-bit builds from the build system (issue754) |
2018-09-12 14:06:38 | jendrik | set | messages:
+ msg7397 |
2018-09-12 13:47:00 | jendrik | set | messages:
+ msg7394 |
2018-09-12 11:19:37 | malte | set | messages:
+ msg7387 summary: Subissues:
- Use operator IDs instead of generating operator pointers (issue692)
- Implement new hash function (issue693)
- Implement new hash table (issue694) (blocked by issue693)
- Store landmarks more efficiently (issue695)
- Test that most important configurations perform as well in 64-bit builds as
in
32-bit builds (this issue)
- Remove option for 32-bit builds from the build system (issue754) -> Subissues:
- Use operator IDs instead of generating operator pointers (issue692)
- Implement new hash function (issue693)
- Implement new hash table (issue694) (blocked by issue693)
- Store landmarks more efficiently (issue695)
- Test that most important configurations perform as well in 64-bit builds as
in
32-bit builds (this issue)
- Make lazy greedy search more memory-efficient in 64-bit mode (issue814)
- Remove option for 32-bit builds from the build system (issue754) |
2018-09-12 11:14:28 | malte | set | messages:
+ msg7385 |
2018-09-12 09:30:12 | jendrik | set | messages:
+ msg7383 |
2018-09-11 12:13:52 | jendrik | set | messages:
+ msg7365 summary: Subissues:
- Use operator IDs instead of generating operator pointers (issue692)
- Implement new hash function (issue693)
- Implement new hash table (issue694) (blocked by issue693)
- Store landmarks more efficiently (issue695)
- Test that most important configurations perform as well in 64-bit builds as in
32-bit builds (this issue)
- Remove option for 32-bit builds from the build system (issue754) -> Subissues:
- Use operator IDs instead of generating operator pointers (issue692)
- Implement new hash function (issue693)
- Implement new hash table (issue694) (blocked by issue693)
- Store landmarks more efficiently (issue695)
- Test that most important configurations perform as well in 64-bit builds as
in
32-bit builds (this issue)
- Remove option for 32-bit builds from the build system (issue754) |
2018-09-11 11:29:28 | malte | set | messages:
+ msg7363 |
2018-09-10 12:12:09 | jendrik | set | messages:
+ msg7356 summary: Subissues:
- Use operator IDs instead of generating operator pointers (issue692)
- Implement new hash function (issue693)
- Implement new hash table (issue694) (blocked by issue693)
- Store landmarks more efficiently (issue695)
After this is done, we should consider changing the build system (e.g., make
64-bit the default build, or remove the option altogether). This is issue754. -> Subissues:
- Use operator IDs instead of generating operator pointers (issue692)
- Implement new hash function (issue693)
- Implement new hash table (issue694) (blocked by issue693)
- Store landmarks more efficiently (issue695)
- Test that most important configurations perform as well in 64-bit builds as in
32-bit builds (this issue)
- Remove option for 32-bit builds from the build system (issue754) |
2017-12-01 10:29:11 | jendrik | set | summary: Subissues:
- Use operator IDs instead of generating operator pointers (issue692)
- Implement new hash function (issue693)
- Implement new hash table (issue694) (blocked by issue693)
- Store landmarks more efficiently (issue695)
After this is done, we should consider changing the build system (e.g., make
64-bit the default build, or remove the option altogether). This is related to
issue687. -> Subissues:
- Use operator IDs instead of generating operator pointers (issue692)
- Implement new hash function (issue693)
- Implement new hash table (issue694) (blocked by issue693)
- Store landmarks more efficiently (issue695)
After this is done, we should consider changing the build system (e.g., make
64-bit the default build, or remove the option altogether). This is issue754. |
2017-04-27 23:33:47 | jendrik | set | priority: feature -> meta messages:
+ msg6258 summary: After this is done, we should consider changing the build system (e.g., make
64-bit the default build, or remove the option altogether). This is related to
issue687. -> Subissues:
- Use operator IDs instead of generating operator pointers (issue692)
- Implement new hash function (issue693)
- Implement new hash table (issue694) (blocked by issue693)
- Store landmarks more efficiently (issue695)
After this is done, we should consider changing the build system (e.g., make
64-bit the default build, or remove the option altogether). This is related to
issue687. |
2017-04-27 19:13:15 | malte | set | messages:
+ msg6253 |
2016-12-19 14:08:13 | florian | set | messages:
+ msg5905 summary: After this is done, we should consider changing the build system (e.g., make
64-bit the default build, or remove the option altogether). This is related to
issue687. |
2016-12-14 21:25:33 | jendrik | set | messages:
+ msg5890 |
2016-12-14 18:28:50 | malte | set | messages:
+ msg5887 |
2016-12-14 18:20:58 | jendrik | set | status: reviewing -> chatting messages:
+ msg5886 |
2016-12-14 18:07:36 | malte | set | messages:
+ msg5885 |
2016-12-14 15:47:41 | jendrik | set | messages:
+ msg5882 |
2016-12-14 03:55:26 | malte | set | messages:
+ msg5873 |
2016-12-14 01:33:41 | jendrik | set | messages:
+ msg5872 |
2016-12-13 17:17:10 | malte | set | messages:
+ msg5871 |
2016-12-13 16:53:43 | jendrik | set | messages:
+ msg5870 |
2016-12-12 23:28:01 | malte | set | messages:
+ msg5868 |
2016-12-12 23:27:37 | malte | set | messages:
+ msg5867 |
2016-12-12 23:21:13 | malte | set | files:
+ hash_sizes.py messages:
+ msg5866 |
2016-12-12 16:29:02 | jendrik | set | messages:
+ msg5865 |
2016-12-12 16:19:14 | florian | set | messages:
+ msg5864 |
2016-12-12 14:36:05 | jendrik | set | messages:
+ msg5863 |
2016-12-12 13:14:08 | malte | set | messages:
+ msg5862 |
2016-12-12 11:56:48 | jendrik | set | messages:
+ msg5861 |
2016-12-11 23:42:54 | malte | set | messages:
+ msg5860 |
2016-12-11 23:15:29 | jendrik | set | messages:
+ msg5859 |
2016-12-10 12:03:43 | malte | set | messages:
+ msg5858 |
2016-12-09 11:17:37 | malte | set | messages:
+ msg5856 |
2016-12-09 09:51:58 | jendrik | set | messages:
+ msg5855 |
2016-12-08 17:56:16 | jendrik | set | messages:
+ msg5854 |
2016-12-08 17:14:10 | malte | set | messages:
+ msg5853 |
2016-12-08 17:10:25 | malte | set | messages:
+ msg5852 summary: Jendrik - re the changes for v1 - is there a need to store the parent operator
at all? A while back I stripped this out of a fork of FD, and when a goal state
was reached, derived what the operators must have been by looking at the
differences between states on the path to the goal. This has a small
postprocessing cost, but saves memory. -> (no value) |
2016-12-08 17:06:09 | andrew.coles | set | summary: Jendrik - re the changes for v1 - is there a need to store the parent operator
at all? A while back I stripped this out of a fork of FD, and when a goal state
was reached, derived what the operators must have been by looking at the
differences between states on the path to the goal. This has a small
postprocessing cost, but saves memory. |
2016-12-08 15:02:34 | malte | set | messages:
+ msg5851 |
2016-12-08 14:53:00 | jendrik | set | messages:
+ msg5850 |
2016-12-05 11:13:42 | jendrik | set | messages:
+ msg5849 |
2016-12-05 10:51:20 | florian | set | messages:
+ msg5847 |
2016-12-05 01:01:48 | jendrik | set | status: chatting -> reviewing messages:
+ msg5846 |
2016-11-30 19:16:01 | malte | set | messages:
+ msg5844 |
2016-11-30 19:10:59 | jendrik | set | assignedto: jendrik messages:
+ msg5843 |
2016-11-30 00:28:34 | jendrik | set | messages:
+ msg5842 |
2016-11-29 21:34:00 | florian | set | messages:
+ msg5841 |
2016-11-29 21:18:41 | jendrik | set | messages:
+ msg5840 |
2016-11-29 14:45:21 | florian | set | messages:
+ msg5838 |
2016-11-29 14:38:37 | jendrik | set | messages:
+ msg5837 |
2016-11-29 14:14:50 | malte | set | messages:
+ msg5835 |
2016-11-29 14:03:22 | florian | set | messages:
+ msg5834 |
2016-11-29 13:57:52 | malte | set | messages:
+ msg5831 |
2016-11-29 11:54:22 | jendrik | set | messages:
+ msg5828 |
2015-03-31 13:38:22 | jendrik | set | messages:
+ msg4128 |
2014-10-27 14:01:11 | andrew.coles | set | files:
+ state_registry_low_memory.tar.gz messages:
+ msg3882 |
2014-10-27 11:20:12 | florian | set | messages:
+ msg3881 |
2014-10-27 11:18:00 | malte | set | messages:
+ msg3880 |
2014-10-27 11:17:39 | florian | set | messages:
+ msg3879 |
2014-10-27 00:45:25 | andrew.coles | set | files:
+ chunk_allocator.h nosy:
+ andrew.coles messages:
+ msg3878 |
2014-10-08 21:30:42 | jendrik | set | nosy:
+ jendrik messages:
+ msg3730 |
2014-10-08 20:37:02 | florian | set | messages:
+ msg3727 |
2014-10-08 12:10:23 | malte | set | messages:
+ msg3714 |
2014-10-08 09:04:47 | florian | set | messages:
+ msg3702 |
2014-10-06 12:20:19 | malte | set | messages:
+ msg3669 |
2014-10-06 12:16:55 | florian | set | messages:
+ msg3668 |
2014-10-04 19:42:03 | malte | set | nosy:
+ florian messages:
+ msg3636 |
2011-08-13 20:17:45 | malte | set | messages:
+ msg1608 |
2011-08-13 19:12:29 | malte | set | messages:
+ msg1607 |
2011-01-23 11:30:08 | malte | set | title: Inefficient Memory Use of Landmarks -> inefficient memory use with -m64 setting |
2011-01-23 11:29:53 | malte | set | messages:
+ msg1220 |
2011-01-23 11:11:44 | erez | create | |