Title new hash table implementation
Priority feature Status reviewing
Superseder Nosy List jendrik, malte
Assigned To jendrik Keywords
Optional summary

Created on 2016-12-14.15:40:43 by jendrik, last changed by malte.

msg6921 (view) Author: malte Date: 2018-03-16.18:12:51
Thanks! Nothing concerning about the load factors.
msg6917 (view) Author: jendrik Date: 2018-03-16.17:25:52
I'll copy your observation about the airport task to issue537, which is concerned 
with making blind search faster again.

The report now aggregates the load factor using the arithmetic mean.
msg6916 (view) Author: malte Date: 2018-03-16.16:58:02
Interesting. I'm surprised to see the timeouts in airport, so I ran Airport #24
on my desktop at home (which is rather crappy, as it is designed not to require
a fan). It ran out of memory in 1000 seconds, which would mean that it is at
least roughly a factor of 2 faster, but I guess this is feasible, as I wasn't
using all cores concurrently and when we last compared desktop and server
machines, the single-core performance of the desktops was much better. Still,
I'm a bit surprised to see memory not being a bottleneck in blind search, and it
would be interesting to do a profile on such a problem at some point. But of
course this is not what this issue is about.

Regarding this issue, it's hard to analyze the load factor from the report
without going through all instances individually due to the way it is
aggregated. Would it be possible to give averages rather than sums?

Otherwise I guess all that is left is reading the comment.
msg6912 (view) Author: jendrik Date: 2018-03-16.16:06:00
I don't see any other change between v2 and v3 apart from the bit-twiddling which 
I'd expect to be responsible for the increase in speed.
msg6911 (view) Author: jendrik Date: 2018-03-16.15:56:45
The experiment uses no timeout for the translator (it uses an older version of Lab).

Here is the diff:

Unfortunately, Bitbucket does not do a good job of formatting it.
msg6904 (view) Author: malte Date: 2018-03-16.15:21:41
Nice to see these improvements. :-) Also nice to see how close 32 and 64 bit are
now (at least for blind search) in terms of memory score.

I'm somewhat surprised to see so many timeouts in airport. Does this use
separate translator and search timeouts, or does it use an overall timeout?

I would still like to have a look at the code comment, but I don't think I'll
have time today. Feel free to remind me tomorrow.

Can you send a link to a diff between v2 and v3? I'm curious if the time
improvements are solely due to the modulo to bit-twiddling change, or if there
are other significant code differences.
msg6893 (view) Author: jendrik Date: 2018-03-16.14:35:45
Here are the experiment results (they were run on the new grid):

Unexplained errors: 
- 8 parcprinter tasks write more than the soft limit to stdout. We already know about this problem.
- 2 scanalyzer tasks and 1 satellite task can't be translated since they need too much memory. Since we run 4 
algorithms, this explains the 12 "slurmstepd: error: Exceeded step memory limit at some point." messages. We 
should investigate why the translator needs more memory on the new grid than on the old grid. The first step, 
however, should be to limit the memory for the translator, which I just implemented in downward-lab.

Coverage stays the same, total_time decreases and memory increases slightly.
msg6868 (view) Author: malte Date: 2018-03-15.17:04:35
The only thing I didn't do yet is read the main comment, which I want to do
later. Because we now do the bit-twiddling, we probably need to run a second
round of experiments? If yes, feel free to run them now. My possible future
comments on the comment shouldn't lead to another round of experiments. ;-)
msg6855 (view) Author: jendrik Date: 2018-03-15.12:15:55
I'm done with the changes we discussed. Could you please have a look again?
msg6836 (view) Author: malte Date: 2018-03-14.18:44:17
Done with the code review (mostly offline).
msg6458 (view) Author: jendrik Date: 2017-07-17.13:09:08
Here are the results of the experiment:

I think the memory, time and coverage scores look good. One important parameter is the maximum 
distance between an entry's ideal and actual bucket. In the experiment above the maximum distance 
is 32. I also ran an experiment where the maximum distance is 64:

While the memory scores are slightly higher than for maximum_distance=32, the time scores drop. 
Coverage scores are almost unaffected.

I propose to stick with maximum_distance=32 for now.
msg6426 (view) Author: jendrik Date: 2017-07-09.14:50:58
I copied the code for the new hash table from issue213, dusted it off and prepared a new 
pull request at . 
Experiments for blind() and lmcut() are running. I'll add this to the review queue.
msg5880 (view) Author: jendrik Date: 2016-12-14.15:40:43
Split off from issue213.
Date User Action Args
2018-03-16 18:12:51maltesetmessages: + msg6921
2018-03-16 17:25:52jendriksetmessages: + msg6917
2018-03-16 16:58:02maltesetmessages: + msg6916
2018-03-16 16:06:00jendriksetmessages: + msg6912
2018-03-16 15:56:45jendriksetmessages: + msg6911
2018-03-16 15:21:41maltesetmessages: + msg6904
2018-03-16 14:35:45jendriksetmessages: + msg6893
2018-03-15 17:04:35maltesetmessages: + msg6868
2018-03-15 12:15:55jendriksetmessages: + msg6855
2018-03-14 18:44:17maltesetmessages: + msg6836
2017-07-17 13:09:08jendriksetmessages: + msg6458
summary: Status: wait for issue693 to be merged ->
2017-07-09 14:50:58jendriksetstatus: unread -> reviewing
messages: + msg6426
2017-04-27 23:25:05jendriksetsummary: Status: wait for issue693 to be merged
2016-12-14 15:40:43jendrikcreate