Issue705

Title: Reduce memory usage of successor generator
Priority: wish
Status: resolved
Nosy List: florian, haz, jendrik, malte
Assigned To: florian

Created on 2017-01-11.10:41:50 by florian, last changed by malte.

Files
File name                             Uploaded                      Type
main.cc                               florian, 2017-04-27.19:50:30  text/x-c++src
successor-generator-statistics.patch  malte, 2017-01-11.18:48:48    text/x-patch
Messages
msg6516 (view) Author: malte Date: 2017-09-03.18:01:05
> I never went into too much detail because the implementation might be different
> in another version and I thought that a rough estimate would be sufficient for
> this issue.

The formula in the code looks as detailed as it could possibly be. :-)
(That's why I only quoted a fragment here.)

The main point I overlooked before my previous comment is that the formula in
the code is already parametric (for unordered_map<K,V>). I didn't realize this
at first because this issue only talks about unordered_set<int>, not about
unordered_map or different key types than int.

I would still like to polish the estimation code a bit and add a method for
unordered_set (we currently only have unordered_map), but I think I know how to
proceed now. Thanks, Florian!
msg6515 (view) Author: florian Date: 2017-09-03.17:34:12
The formulas in msg6278 are mostly based on guesses that try to make sense of
the measured data from the program attached to this issue (main.cc). I think I
ran the program in 32-bit and 64-bit mode and looked at the differences back then.
I also remember looking at the implementation on my laptop. The comments in
the source explained the implementation strategy and helped me come up
with a guess for the formula.

I never went into too much detail because the implementation might be different
in another version and I thought that a rough estimate would be sufficient for
this issue.
msg6495 (view) Author: malte Date: 2017-09-01.16:53:31
Not sure if this is the best place to put this information, but it may be better
than putting it nowhere. Once we have resolved these questions, perhaps this
would be useful information for the ForDevelopers section of the homepage.

I wanted to generalize Florian's information on the memory requirements of
unordered_set<int> from msg6278 to unordered_set<T>, but it's not fully clear to
me how to generalize it. Let's look at the non-constant parts of Florian's info
on unordered_set<int>:

    M*2*(i + p) [entries]
    B*p [buckets]

where

    M: number of set elements
    B: number of buckets (roughly in the range [M, 2M]; more on this in msg6278)
    i: size of an integer
    p: size of a pointer

I'm wondering about the M*2*(i+p) part and how to generalize it to non-int sets.
The formula implies that we store two pointers and two ints for every entry.

I assume one of the two ints is the actual payload and the other one is the hash
value. Is this correct? In that case, I think it would be clearer to say that
this is one int and one size_t, as these have different sizes in 64-bit builds?
(But perhaps despite the 4-byte ints in a 64-bit build, we'd get 32 bytes per
entry here rather than 28 due to alignment?)

Are the two pointers a forward and backward pointer in a linked list? This
surprises me slightly as the description of the implementation I found mentioned
that this should be a singly linked list in gcc, but this may be out of date
(http://bannalia.blogspot.de/2013/10/implementation-of-c-unordered.html).
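
For concreteness, here is one way the estimate could generalize, under the
assumption that each node stores one list pointer, a cached size_t hash and the
payload, rounded up to pointer alignment. This is a guess, not verified against
the libstdc++ sources, and the helper name is made up:

    #include <cstddef>

    size_t estimate_unordered_set_bytes(size_t num_entries, size_t num_buckets,
                                        size_t payload_size,
                                        size_t malloc_overhead = 8) {
        const size_t p = sizeof(void *);
        size_t node = p + sizeof(size_t) + payload_size; // next ptr + hash + payload
        node = (node + p - 1) / p * p;                   // round up to alignment
        return num_entries * (node + malloc_overhead)    // one allocation per node
               + num_buckets * p;                        // bucket array
    }

For an int payload on a 64-bit build this gives 24 bytes per node plus
allocation overhead, which is at least consistent with the M*2*(i + p) term
above.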
msg6457 (view) Author: jendrik Date: 2017-07-17.09:43:57
> Jendrik, will this need any follow-up work for the hash functions that are
> used in some of the successor generator nodes?

I don't think so. We plan to keep using the built-in hash functions for hashing 
primitive types and as far as I can see the successor generator code only hashes 
ints.
msg6452 (view) Author: malte Date: 2017-07-15.14:51:12
No separate report needed for this issue. I'd still be interested in seeing
these numbers for the current default, but we can do that over at issue732. Thanks!
msg6450 (view) Author: florian Date: 2017-07-15.08:34:30
I didn't get to this on Friday and probably won't have time on the weekend. Do
you still need the numbers from msg6444? It looks like you already did what you
wanted to do with them. I'll run an experiment for issue732 when I get back and
measure time in there. Is that enough or should I also generate the report for
this issue branch?
msg6447 (view) Author: malte Date: 2017-07-14.15:37:40
The new issue is issue732.

Florian, I've added you to the nosy list, hoping that you could perhaps help me
with the experiments and/or do a code review once the code is ready to be
reviewed. If you would rather not be involved, just let me know. Of course
everybody else is also welcome to join the discussion at issue732.
msg6445 (view) Author: malte Date: 2017-07-14.15:34:49
I've played around a bit with the algorithm for creating the successor
generators, and the initial results look quite promising.

I tested on the last tetris-sat14 task, since the scatter plot suggested this
might be a challenging domain for our current algorithm. The experimental new
implementation creates the successor generator in 0.06 seconds, more than 100
times faster than our current code (8.52 seconds). Without more experiments, I'm
not convinced that the algorithm is implemented correctly, but at least every
operator seems to show up in the resulting successor generator exactly once,
which is a good property to have. :-)

I'll create a new issue for this.
msg6444 (view) Author: malte Date: 2017-07-14.11:44:39
Hi Florian, can you let me know which task is the one where successor generator
creation takes the longest with the new code? Even better if you have a report
-- it looks like the latest report doesn't include data on successor generator
creation time and peak memory increase.
msg6443 (view) Author: florian Date: 2017-07-13.19:27:42
Done. Jendrik, will this need any follow-up work for the hash functions that are
used in some of the successor generator nodes? I saw that issue693 was merged in
the meantime.
msg6441 (view) Author: malte Date: 2017-07-10.18:00:47
OK, feel free to merge.
msg6440 (view) Author: florian Date: 2017-07-10.17:45:00
> The satisficing results don't look that good, and there are some very odd data
> points, for example nomystery-sat11-strips:p06.pddl, where the number of
> evaluations increases from 276 to 6326008. Any ideas? Or is this a
> non-reproducible glitch?

I think this is a glitch. There is a difference in 24 instances. I tried some of
them locally and got very different results. Sometimes the local numbers matched
the grid, sometimes the local master version matched the issue branch on the
grid, sometimes both were different, ... I would suspect that this is an effect
of one of the bugs in the landmark code.
msg6437 (view) Author: malte Date: 2017-07-10.16:51:36
Code looks good; I made one more comment.
msg6436 (view) Author: malte Date: 2017-07-10.16:48:28
The satisficing results don't look that good, and there are some very odd data
points, for example nomystery-sat11-strips:p06.pddl, where the number of
evaluations increases from 276 to 6326008. Any ideas? Or is this a
non-reproducible glitch?
msg6435 (view) Author: florian Date: 2017-07-10.15:14:32
The new results still look good. This revision includes the fixes for your
comments up to Friday; your last comments probably won't influence the results
much (I addressed them today).

Sorry about all the unexplained errors. I ran LM-cut, ipdb, cegar and lama-first
on all tasks from the optimal and satisficing suites, so the optimal configurations
crashed on tasks with axioms or conditional effects. There were no other
unexplained errors.

http://ai.cs.unibas.ch/_tmp_files/pommeren/issue705-v8-issue705-base-issue705-v12-compare.html
http://ai.cs.unibas.ch/_tmp_files/pommeren/issue705-search_time-base-v12.png
msg6422 (view) Author: malte Date: 2017-07-08.15:34:11
Done with my bitbucket comments.
msg6421 (view) Author: malte Date: 2017-07-08.15:01:46
If I recall correctly, the issue is that a variable is already called
interesting if there are any immediate operators, even though these are not
related to the variable. It also seemed to me at first glance as if the big-O
complexity of the
algorithm we're using is not as good as it could be. I suggest adding a TODO
there, such as:

"TODO: Look into more efficient algorithms that don't iterate over all operators
in the subtree at each layer of recursion. It also seems that we currently
consider variables "interesting" as soon as there are immediately applicable
operators, even if no preconditions on the variable exist. This could be improved."

Depending on how soon we can merge this issue, I might have a stab at this.
(This weekend I might have some time; later on perhaps not.)
msg6417 (view) Author: florian Date: 2017-07-07.18:29:55
Thanks for the review. I made some changes and made one more attempt to use
unique_ptr in the classes. This time it worked! I don't know what went wrong the
last time. I hope it's not an OS-specific thing.

I remember we had some discussion about the semantics of var_is_interesting and
immediate operators but cannot recall what I should change.
msg6408 (view) Author: malte Date: 2017-07-05.17:49:00
As discussed offline, I'm done with my code review. The two main things I
remember are the "cout << endl" line and the use of "override".

It would be nice to add a few comments: one explaining why we use raw pointers,
another mentioning that the construction algorithm and its handling of immediate
operators could perhaps be improved a bit. But I don't think we need to do this
improvement in this issue; we should not hold it up further. I'd be happy to
take a stab at this algorithm if I can find time, also to make it more
efficient, but again we should not let this hold up this issue.
msg6405 (view) Author: florian Date: 2017-07-04.19:53:24
Here is the full report:
http://ai.cs.unibas.ch/_tmp_files/pommeren/issue705-v7-issue705-base-issue705-v11-compare.html

I'll run more experiments as you suggested.
msg6403 (view) Author: malte Date: 2017-07-04.19:33:03
I've started looking at the code.

For the final version, it would be interesting to have some more experimental
results, for example for lama-first (on the satisficing suite), A* + lmcut and
A* + cegar or A* + ipdb to see if we see coverage changes somewhere. I assume
for A* + blind we see no effect because the timeout doesn't matter much there
(because it tends to run out of memory), and the tasks where we get substantial
memory savings for the successor generator itself are too large for blind search.
msg6402 (view) Author: malte Date: 2017-07-04.19:27:40
Do you have a lab report of base vs. v11 (other than the scatter plot)?
msg6315 (view) Author: florian Date: 2017-05-02.14:18:07
Here are the results of the final experiment:

(v9) merged in the current default branch, including issue688, which uses operator
IDs instead of OperatorProxy in the successor generator. This reduces the
required memory because OperatorProxy contains an additional pointer to the task.
 
http://ai.cs.unibas.ch/_tmp_files/pommeren/issue705-sg_peak_mem_diff_per_task_size-v8-v9.png

(v10) then also uses variable IDs instead of variable proxies for variables in
switches, which reduces memory usage for the same reason but to a smaller
degree, because switch variables are only responsible for a small fraction of
memory usage.
 
http://ai.cs.unibas.ch/_tmp_files/pommeren/issue705-sg_peak_mem_diff_per_task_size-v9-v10.png

(v11) is the revision in the pull request. I'm not aware of any optimization
between v10 and v11. I only moved things around a bit to clean up the code
(split large functions into smaller ones, uncrustify, etc.). Still, the results
show some performance improvement between the versions. I'm not sure where this
is coming from, but I won't look that gift horse in the mouth.
 
http://ai.cs.unibas.ch/_tmp_files/pommeren/issue705-sg_peak_mem_diff_per_task_size-v10-v11.png


Overall, here is the comparison of the base version to v11:

Memory per task object decreases significantly, especially in the large cases:
 
http://ai.cs.unibas.ch/_tmp_files/pommeren/issue705-sg_peak_mem_diff_per_task_size-base-v11.png

Search time decreases on average (in some domains by a lot) although there is no
difference in coverage:
  http://ai.cs.unibas.ch/_tmp_files/pommeren/issue705-search_time-base-v11.png

Construction time also decreases significantly in most cases. In Tidybot the
time increases somewhat (the two extreme cases are 4.4s to 6.7s and 3.3s to
5.1s) but in general, the effect is clearly positive.
 
http://ai.cs.unibas.ch/_tmp_files/pommeren/issue705-sg_construction_time-base-v11.png
msg6279 (view) Author: florian Date: 2017-04-28.18:58:26
I think this is ready for review now. There is more that could be done but I
don't think it is necessary until we find a case where the successor generator is a
greater bottleneck. I added some comments on possible optimizations to the
already existing list in the file.

The pull request is on Bitbucket; the experiments for the final version are running:
https://bitbucket.org/FlorianPommerening/downward-issues/pull-requests/34
msg6278 (view) Author: florian Date: 2017-04-28.18:52:58
The estimation in the last post was completely off because I used a 64-bit
compile and assumed 32-bit pointers. The updated formulas make more sense. Say p
is the size of a pointer and i is the size of an int. 

1) Vectors:
       3*p [sizeof(vector)]
     + N*i [entries]
     + 2*p  [overhead]
   Again, there is some alignment issue that causes vector sizes to increase in
   steps of 2 (32-bit) or 4 (64-bit).

2) Hash maps:
       7*p [sizeof(unordered_set)]
     + M*2*(i + p) [entries]
     + B*p [buckets]
     + 2*p [overhead]
   The number of buckets B can be predicted quite well with the small number of
   entries that we are looking at. It's B = 3 * 2^n - 1 for the smallest value
   of n such that B > M. The size per entry 2*(i + p) is what the data structure
   requires but in some combinations of word width and entry size the maps use
   more memory per entry. I also suspect alignment issues here but did not
   investigate further.
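
Written out as code, the formulas above could look like this; a
back-of-the-envelope sketch in the spirit of this message, not actual
repository code:

    #include <cstddef>

    size_t estimate_vector_bytes(size_t num_entries) {
        const size_t p = sizeof(void *);
        const size_t i = sizeof(int);
        return 3 * p             // sizeof(vector)
               + num_entries * i // entries
               + 2 * p;          // allocation overhead
    }

    size_t estimate_hash_map_bytes(size_t num_entries, size_t num_buckets) {
        const size_t p = sizeof(void *);
        const size_t i = sizeof(int);
        return 7 * p                       // sizeof(unordered_set)
               + num_entries * 2 * (i + p) // entries
               + num_buckets * p           // buckets
               + 2 * p;                    // allocation overhead
    }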
msg6254 (view) Author: florian Date: 2017-04-27.19:50:30
I attached a small program to measure the memory impact of vectors and hash
maps. If you call it as "./main N M", it creates 100000 unordered maps, adds M
random values from [1..N] to each, and measures the change in peak memory
usage. When line 45 is commented out and line 46 is commented in, it creates
vectors with N elements instead. I ran this for N = 1..50 and M = 1..N and got
stable results.

The overhead of both classes is much higher than I thought:

1) Vectors: 40 + 4N. Actually, the memory usage increases in steps of 16, so it
stays constant for 4 consecutive sizes (might be an alignment issue).

2) Hash maps: 96 + 32M + 16B where B is an offset that gets bigger with M at
certain stages. The stages start out as 3*2^n - 1 (2, 5, 11, 23, 47) but then
the gaps are slightly larger. The values of B are close to M at the start of the
stage (1, 4, 10, 22, 47).

The next revision adds these formulas as size estimations for the switches. When
creating a switch, we estimate both sizes and use the class with the lower
estimate. Since values of B are more relevant for N and M below 50, I added
explicit values for them, and use an estimation of 3*2^n - 1 for higher values.
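
A minimal sketch of this estimate-and-choose rule, using the measured 64-bit
formulas from above (helper names are made up; the explicit bucket table for
small sizes is omitted here):

    #include <cstddef>

    size_t predict_num_buckets(size_t num_entries) {
        // B = 3 * 2^n - 1 for the smallest n with B > M: 2, 5, 11, 23, 47, ...
        size_t buckets = 2;
        while (buckets <= num_entries)
            buckets = 2 * buckets + 1; // 2 * (3*2^n - 1) + 1 = 3*2^(n+1) - 1
        return buckets;
    }

    bool should_use_hash_switch(size_t domain_size, size_t num_nonempty) {
        size_t vector_estimate = 40 + 4 * domain_size;
        size_t hash_estimate = 96 + 32 * num_nonempty
                               + 16 * predict_num_buckets(num_nonempty);
        return hash_estimate < vector_estimate;
    }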


The memory usage unfortunately increases with this patch:
 
http://ai.cs.unibas.ch/_tmp_files/pommeren/issue705-sg_peak_mem_diff_per_task_size-v7-v8.png

I assume that is because the estimation is off: I measured vectors of ints, but
the operator proxies we currently store contain an additional pointer to the task. I'll
merge the default branch with issue688 and repeat this experiment.
msg6241 (view) Author: florian Date: 2017-04-27.13:25:11
As we discussed offline, I added another specialized class for a leaf node that
only adds a single operator. As before, this improves our metric (estimated
bytes per task object):
 
http://ai.cs.unibas.ch/_tmp_files/pommeren/issue705-sg_peak_mem_diff_per_task_size-v6-v7.png
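
A sketch of such a single-operator leaf (declarations approximate; State and
OperatorID are the planner's existing types, and GeneratorBase is the common
virtual base from the split described in msg6227 further below):

    #include <vector>

    class GeneratorLeafSingle : public GeneratorBase {
        OperatorID op_id;
    public:
        explicit GeneratorLeafSingle(OperatorID op_id)
            : op_id(op_id) {}
        void generate_applicable_ops(
            const State &, std::vector<OperatorID> &applicable_ops) const override {
            // No test left to perform at this point; just emit the operator.
            applicable_ops.push_back(op_id);
        }
    };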

The effect on overall search time and memory usage was mostly smaller than the
noise.

The next step is to look for better rules of when to use which switch node.
msg6239 (view) Author: florian Date: 2017-04-27.11:46:46
I added two more specialized classes and a missing vector.reserve():

(v4) just adds the call to reserve and a bunch of statistics output.
Compared to (v3) memory usage actually increases in some cases (mostly
transport and scanalyzer and mostly between 100% and 120% of the
original value). Not sure what's going on here.
  http://ai.cs.unibas.ch/_tmp_files/pommeren/issue705-v3-v4.png

(v5) uses an unordered_map instead of a vector for switch nodes when
less than 50% of vector elements are non-zero. Compared to (v4) this
decreases memory usage on the high end of the curve (cases where we
need more bits per task object) and has some overhead on the low end.
Overall it looks quite good, but we could experiment with more
sophisticated criteria for switching to the hash-based switch node:
  http://ai.cs.unibas.ch/_tmp_files/pommeren/issue705-v4-v5.png
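
A sketch of what the hash-based switch from (v5) might look like (names
approximate; GeneratorBase is the common virtual base from the split described
in msg6227 further below):

    #include <memory>
    #include <unordered_map>
    #include <vector>

    class GeneratorSwitchHash : public GeneratorBase {
        int switch_var_id;
        std::unordered_map<int, std::unique_ptr<GeneratorBase>> generator_for_value;
    public:
        void generate_applicable_ops(
            const State &state, std::vector<OperatorID> &applicable_ops) const override {
            // Only non-empty children are stored; absent values generate nothing.
            auto it = generator_for_value.find(state[switch_var_id].get_value());
            if (it != generator_for_value.end())
                it->second->generate_applicable_ops(state, applicable_ops);
        }
    };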

(v6) After (v5) a lot of wasted memory came from hash maps with just
one entry, so I added a specialized switch node that just does a single
check. This had a clearly positive result.
  http://ai.cs.unibas.ch/_tmp_files/pommeren/issue705-v5-v6.png


I also ran blind search on revisions (v3), (v5), and (v6). I uploaded
the plots that compare the base to (v6). Let me know if you are
interested in other results as well.

Runtime decreases in general but not enough to lead to increased
coverage. Memory usage is only dominated by the successor generator in
very small cases, so there is little effect. Successor generator
construction time can also decrease quite a bit, but again, this usually
doesn't influence overall runtime.
  http://ai.cs.unibas.ch/_tmp_files/pommeren/issue705-total_time-base-v6.png
  http://ai.cs.unibas.ch/_tmp_files/pommeren/issue705-search_time-base-v6.png
  http://ai.cs.unibas.ch/_tmp_files/pommeren/issue705-memory-base-v6.png
 
http://ai.cs.unibas.ch/_tmp_files/pommeren/issue705-sg_construction_time-base-v6.png
msg6228 (view) Author: malte Date: 2017-04-26.15:00:38
Sounds good! We should also evaluate how efficiently we generate successors,
i.e., run a search. Only running a blind search should be enough, but if you
want, you can also add astar(lmcut()) to have a bit more data.

I don't think we need this for all revisions, but I think it would be good to
have the data for the baseline vs. v3, and then also for the following revisions.
msg6227 (view) Author: florian Date: 2017-04-26.14:56:41
I started working on this issue. As a first step, we wanted to clean up the
interface a bit. Currently, the "switch" node performs three different tasks:
 1) add immediately applicable operators
 2) fork the search into the "don't care successor" and the "value successor"
 3) branch over values of a variable

To that end, we split the switch class into three classes for those three
tasks. The following plots compare the base revision against the revision after
some changes. We measure the change M in peak memory caused by creating the
successor generator (approximating the actual memory usage) and the task size N
reported by the translator (approximating the task encoding size). The x-axes of
the plots show the ratio M/N, which approximates the number of bytes per object
in the task encoding. The y-axes show by how much this ratio changed in the new
version.
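
A rough sketch of the common interface and the fork node from this split
(declarations approximate, not the actual repository code; State and OperatorID
are the planner's existing types):

    #include <memory>
    #include <vector>

    class GeneratorBase {
    public:
        virtual ~GeneratorBase() = default;
        // Appends all operators applicable in the given state.
        virtual void generate_applicable_ops(
            const State &state, std::vector<OperatorID> &applicable_ops) const = 0;
    };

    // Task 2): fork into the "don't care successor" and the "value successor".
    class GeneratorFork : public GeneratorBase {
        std::unique_ptr<GeneratorBase> dont_care_successor;
        std::unique_ptr<GeneratorBase> value_successor;
    public:
        void generate_applicable_ops(
            const State &state, std::vector<OperatorID> &applicable_ops) const override {
            dont_care_successor->generate_applicable_ops(state, applicable_ops);
            value_successor->generate_applicable_ops(state, applicable_ops);
        }
    };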

(v1) After splitting off functionality 1), memory usage decreases in some cases
because we no longer have the overhead of the vector to store immediate
operators when they are not needed:
  http://ai.cs.unibas.ch/_tmp_files/pommeren/issue705-base-v1.png

(v2) After additionally splitting off functionality 2), memory usage increases
again (in some cases above the original size) because we now have additional
objects to fork the search. This can also help in situations where we previously
had a switch without a don't care successor but this doesn't occur often. So
this change makes things worse but it cleans up the interface:
  http://ai.cs.unibas.ch/_tmp_files/pommeren/issue705-base-v2.png


(v3) After splitting the "switch" nodes into three parts, I removed the "empty"
node that was one of the main reasons for the memory usage in large tasks. In
most node types, we can assert that successors are not empty (otherwise, we
don't create the node). Only when branching over values can a successor be
empty. The new code uses nullptr in this situation and adds a check. The memory
usage clearly improved and also mostly offset the loss from (v2):
  http://ai.cs.unibas.ch/_tmp_files/pommeren/issue705-base-v3.png
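
Sketched, the (v3) convention in the value-branching node (names approximate):

    class GeneratorSwitchVector : public GeneratorBase {
        int switch_var_id;
        // One entry per value of the variable; nullptr marks empty successors.
        std::vector<std::unique_ptr<GeneratorBase>> generator_for_value;
    public:
        void generate_applicable_ops(
            const State &state, std::vector<OperatorID> &applicable_ops) const override {
            const GeneratorBase *child =
                generator_for_value[state[switch_var_id].get_value()].get();
            if (child) // nullptr: no operator has this precondition value
                child->generate_applicable_ops(state, applicable_ops);
        }
    };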


The next step is to add a hash-based switch node for cases where the switch node
over values is only sparsely populated.
msg6085 (view) Author: malte Date: 2017-01-11.20:17:25
> In the meantime, I'd be interested to know if the non-determinism discovered here
> (set for landmarks?) was the cause for altered expansions/evaluations when we tried
> optimizing the successor generator construction (issue556). Perhaps not, as blind
> search was giving non-uniform results, but maybe something has changed since then
> that has inadvertently fixed the issue.

No, that issue only affects landmark-based configurations (and as far as I
understand, only ones using reasonable orders, so not the optimal ones). In
issue557, we already saw changes in expansions/evaluations in A* with blind
search. To me the most likely explanation is that the code in issue556 itself
was buggy because A* with blind search should really be stable in terms of
generated/expanded/evaluated nodes with a fixed operator ordering. But I don't
think anyone looked into this further.
msg6084 (view) Author: haz Date: 2017-01-11.20:09:51
I s'pose. Then again I trust my back-of-the-envelope math far less than you guys 
might. It's the quick statistics that I crave from the memcache outputs -- how 
many internal/leaf nodes in the successor generator, how big is each, what do the 
counts look like with the other metrics (op numbers, precond numbers, etc). All 
could be lifted with the right code alterations, but I've frequently found things 
I never thought to consider when just browsing the memcache stats.

If 662 isn't picked up in the next couple of weeks, I'll have a crack at finding 
a work-around when things ease up a bit commitment-wise. In the meantime, I'd be 
interested to know if the non-determinism discovered here (set for landmarks?) 
was the cause for altered expansions/evaluations when we tried optimizing the 
successor generator construction (issue556). Perhaps not, as blind search was 
giving non-uniform results, but maybe something has changed since then that has 
inadvertently fixed the issue.
msg6079 (view) Author: malte Date: 2017-01-11.19:14:37
Disagree in this case. :-)

It's not difficult to understand the memory complexity of the successor
generator, and the inaccuracies of a back-of-the-envelope estimate (e.g. whether
typical dynamic allocation overhead is 8 bytes, 12 bytes or whatever) don't
really matter much in understanding what is wrong and what would be a better
solution.

Profiling will tell you what happens in a particular task, but that's no
replacement for understanding what happens in general because the successor
generator can have so many different extreme cases (we have seen tasks with
30000 variables; tasks with a few variables but ranges around 1000; tasks with a
million operators; tasks where one operator has hundreds of thousands of
conditional effects -- ok, these don't actually matter for the successor generator).

But of course memory profiling can be very useful in general, and if someone
wants to work on issue662, this would be great. Enthusiasm for cmake-related
tasks is often a bit low around here, though. :-)

Ideally, I think one should both profile and try to understand the data
structure intellectually. When Jendrik and I looked at the state registry hash
table implementation recently, I felt that the combination of the two was very
insightful and critical for designing something better, much more so than only
doing one of the two. But here with the successor generator the data structures
are so simple that a profiler is not actually that useful.
msg6078 (view) Author: haz Date: 2017-01-11.19:00:44
Would be so much easier with memcache in hand (à la issue662).
msg6074 (view) Author: malte Date: 2017-01-11.18:48:48
I'm attaching a patch that provides the statistics in msg6064. Note that the
output is based on 32-bit assumptions, as it uses hard-coded numbers like "4" to
estimate the size of a pointer; hence it will underestimate the memory usage on
a 64-bit system by a factor of roughly 2. The patch applies cleanly to revision
609499afc23a, which as of this writing is an unmerged revision on the issue688
branch.

The patch does not apply cleanly to default, but if someone wants to work on
this, it should be easy to port because it merely adds a bunch of methods (plus
some statistical output at the start and end of successor generator
construction). It may be easier to wait until issue688 is merged, though, to
avoid a merge conflict with this issue as both touch the successor generator.

One aspect I didn't mention in msg6064 is that it is wasteful in data structures
like these to use "VariableProxy" instead of a variable ID. In the example, this
causes us to store 13847 pointers to the same abstract task in the successor
generator. For really large tasks with more complex preconditions than
Logistics, this number can easily be on the order of 10^5-10^6.
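
Schematically, the waste is one task pointer per stored proxy (layout
illustrative, assuming VariableProxy pairs a task pointer with an id as
described above):

    struct VariableProxyLayout {
        const AbstractTask *task; // same pointer repeated in every switch node
        int var_id;
    };
    // Storing just the int var_id saves a pointer per switch node and avoids
    // the 13847 copies of the same task pointer mentioned above.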
msg6066 (view) Author: florian Date: 2017-01-11.10:42:03
Quoting Malte from msg6064:

>> By the way, I'm surprised that a successor generator for logistics98 task #25
>> should take so much memory. There are roughly 40000 operators, and Logistics
>> operators should have either 1 or 2 preconditions. The successor generator size
>> should scale roughly with the total number of preconditions, so less than 80000
>> "units" in this case, and 100 MiB for this sounds a lot.
>
> I looked into this. The thing I missed is that every internal node of the
> successor generator stores a pointer for every value of the tested variable,
> even if there are no operators that have this value as a precondition. This
> could be avoided by using a hash map instead of a vector to access the children,
> and then we would indeed be linear in the total number of preconditions (plus
> the number of actions without preconditions), but of course using hash maps has
> other disadvantages.
> 
> A good compromise could be an "adaptive" implementation that uses vectors when
> they are densely populated and something else otherwise. This could fit into the
> current framework easily because we already have polymorphic node types. This
> could be taken further by e.g. having dedicated node types that just add an
> operator or set of operators to the result set and then chain to another node,
> which would let us avoid lots of empty or tiny vectors for "immediate_operators"
> and friends.
> 
> With the current implementation we can be a factor of "max variable domain size"
> worse than the big-O memory usage we would like, and Logistics is one of the
> domains where variable domains can get very large. There are 29 variables with a
> domain size of 13 (representing trucks), 13 variables with a domain size of 27
> (representing planes), and 18 variables with a domain size of 393 (representing
> packages).
> 
> Creating the successor generator indeed increases peak memory usage by ~100 MiB,
> so this is roughly the amount of space that is actually used by the successor
> generator including memory management overhead. ("Roughly" because just
> measuring peak memory increase is a somewhat dangerous way to measure this,
> especially if a lot of memory has been released recently.)
> 
> I added some code with some rough calculations to measure how much memory is
> used for what. My numbers only add up to 84 MB, so are somewhat inaccurate,
> which I assume is because I didn't take into account memory padding or vector
> overallocation (although I hope we use reserve to avoid the latter), and I
> estimated the memory management overhead somewhat optimistically as 8 bytes per
> call to "new".
> 
> For tasks like these, empty successor generators and pointers to them are
> responsible for the bulk of the memory usage. With 64-bits, this will get worse,
> as all pointers are double the size. If we're unlucky, memory usage will almost
> double with a 64-bit compile. Perhaps it's not quite as bad as that because
> memory management overhead doesn't double. [Addendum: I have now tried it out
> with a 64-bit compile, and memory usage indeed roughly doubles to ~206 MB for
> each successor generator.]
> 
> Here are some numbers I gathered for this Logistics task (32-bit mode). This is
> for the second successor generator created, but all numbers other than the peak
> memory statistics in the first two lines are identical for the first one.
> 
> MALTE: Constructing successor generator... [t=1.0473s, 138584 KB]
> MALTE: Done constructing successor generator... [t=1.45588s, 241480 KB]
> MALTE: SG size estimates: total: 83767368
> MALTE: SG size estimates: object overhead: 62139348
> MALTE: SG size estimates: operators: 527192
> MALTE: SG size estimates: switch var: 110776
> MALTE: SG size estimates: generator for value: 20934664
> MALTE: SG size estimates: default generator: 55388
> MALTE: SG object counts: switches: 13847
> MALTE: SG object counts: leaves: 26208
> MALTE: SG object counts: empty: 5138224
> 
> If the estimates are accurate, then roughly 3/4 of the overall memory is used
> for empty successor generators. This could be cut down to 0 by using nullptr
> instead of millions of identical empty generators, or to a small constant number
> of bytes by reusing the same empty generator, similar to the reuse of nodes in
> the look-up trees for Cartesian abstractions. Of the remaining 1/4 of memory
> used, more than 90% is taken up by the "generator for value" vectors. By storing
> this information more cleverly and perhaps also being a bit cleverer about
> memory allocation (we could reduce overhead to close to zero by using an arena
> allocator), it should be possible to reduce overall successor generator memory
> usage by roughly 90% for tasks like this one.
msg6065 (view) Author: florian Date: 2017-01-11.10:41:50
The successor generator currently wastes a lot of memory. In some Logistics tasks
it requires hundreds of MB. We want to reduce the memory overhead in this issue.
History

Date                 User     Action  Args
2017-09-03 18:01:05  malte    set     messages: + msg6516
2017-09-03 17:34:12  florian  set     messages: + msg6515
2017-09-01 16:53:31  malte    set     messages: + msg6495
2017-07-17 09:43:57  jendrik  set     messages: + msg6457
2017-07-15 14:51:12  malte    set     messages: + msg6452
2017-07-15 08:34:30  florian  set     messages: + msg6450
2017-07-14 15:37:40  malte    set     messages: + msg6447
2017-07-14 15:34:49  malte    set     messages: + msg6445
2017-07-14 11:44:39  malte    set     messages: + msg6444
2017-07-13 19:27:42  florian  set     status: reviewing -> resolved; messages: + msg6443
2017-07-10 18:00:47  malte    set     messages: + msg6441
2017-07-10 17:45:00  florian  set     messages: + msg6440
2017-07-10 16:51:36  malte    set     messages: + msg6437
2017-07-10 16:48:28  malte    set     messages: + msg6436
2017-07-10 15:14:32  florian  set     messages: + msg6435
2017-07-08 15:34:11  malte    set     messages: + msg6422
2017-07-08 15:01:46  malte    set     messages: + msg6421
2017-07-07 18:29:55  florian  set     messages: + msg6417
2017-07-05 17:49:00  malte    set     messages: + msg6408
2017-07-04 19:53:24  florian  set     messages: + msg6405
2017-07-04 19:33:03  malte    set     messages: + msg6403
2017-07-04 19:27:40  malte    set     messages: + msg6402
2017-05-02 14:18:07  florian  set     messages: + msg6315
2017-04-28 18:58:26  florian  set     status: chatting -> reviewing; messages: + msg6279
2017-04-28 18:52:58  florian  set     messages: + msg6278
2017-04-27 19:50:30  florian  set     files: + main.cc; messages: + msg6254
2017-04-27 13:25:11  florian  set     messages: + msg6241
2017-04-27 11:46:46  florian  set     messages: + msg6239
2017-04-26 15:00:38  malte    set     messages: + msg6228
2017-04-26 14:56:41  florian  set     assignedto: florian; messages: + msg6227
2017-01-11 20:17:25  malte    set     messages: + msg6085
2017-01-11 20:09:51  haz      set     messages: + msg6084
2017-01-11 19:14:37  malte    set     messages: + msg6079
2017-01-11 19:00:44  haz      set     messages: + msg6078
2017-01-11 18:48:48  malte    set     files: + successor-generator-statistics.patch; messages: + msg6074
2017-01-11 12:21:59  haz      set     nosy: + haz
2017-01-11 10:42:03  florian  set     status: unread -> chatting; messages: + msg6066
2017-01-11 10:41:50  florian  create