Created on 2011-04-11.12:11:36 by jendrik, last changed by jendrik.
Attached file: test233.sh (text/x-sh), uploaded by malte on 2011-08-05.22:27:32.

msg1682
Author: jendrik
Date: 2011-08-25.14:39:27

The scripts have been adapted. The output files are now only scanned if the
values are not found in the logs.

msg1637
Author: jendrik
Date: 2011-08-14.23:45:02

I opted for setting it to None whenever there are multiple values for
initial_h_value.

msg1633
Author: malte
Date: 2011-08-14.21:38:01

Like in so many cases, the planner is flexible enough here to make a result
analysis that captures all relevant possibilities very tough. But I think in all
cases we've actually run so far, in a setting where we use only one heuristic
and multiple searches, the initial h value will always be the same.

So one thing we could do is check if all list entries are the same, and if that
is the case, keep that value as the initial_h_value. If not all list entries are
the same, we could set it to None. But I'm not sure how robust that is.

I'm also fine with setting it to None whenever we have more than one initial_h line.
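The all-entries-equal rule sketched above could look like this; `resolve_initial_h_value` is a hypothetical helper name for illustration, not code from the actual scripts:

```python
def resolve_initial_h_value(h_values):
    """Collapse the list of initial h values (one per search iteration)
    into a single attribute value.

    If every iteration reports the same value, keep it; otherwise (or if
    the list is empty) return None to signal "not well-defined".
    """
    if not h_values:
        return None
    first = h_values[0]
    return first if all(v == first for v in h_values) else None
```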

msg1632
Author: jendrik
Date: 2011-08-14.21:30:03

Thanks for the clarification. I read "iterated" instead of
"multi-heuristic".
For iterated search, can we deduce a run's single initial_h_value from
the list of initial_h_values?

msg1630
Author: malte
Date: 2011-08-14.21:01:51

> BTW: What about lines like "Initial state h value: 1147184/1703241."?
> Which value should be taken here?

That's what I meant with this comment:

    For now, I would suggest setting it to None or some other value that signals
    "not well-defined" if there is more than one heuristic value given in the
    "initial h" line. So we don't parse initial h values for multi-heuristic
    search.

(Just to be clear, this was *not* directed at iterated search, but at search
that uses multiple heuristics. It might be possible to come up with a reasonable
definition of "initial_h" for iterated search that only uses a single heuristic.
But I don't think it's necessary right now.)
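A minimal sketch of this parsing rule, assuming the log line format quoted in this thread (function and pattern names are illustrative, not the actual resultfetcher code): a single integer value is accepted, while a slash-separated multi-heuristic value fails the match and leaves the attribute None.

```python
import re

# Matches only a single-heuristic line such as "Initial state h value: 12."
# A multi-heuristic line like "Initial state h value: 1147184/1703241."
# does not match, so the attribute stays None ("not well-defined").
_INITIAL_H = re.compile(r'Initial state h value: (\d+)\.')

def parse_initial_h(log_text):
    match = _INITIAL_H.search(log_text)
    return int(match.group(1)) if match else None
```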

msg1623
Author: jendrik
Date: 2011-08-14.20:20:04

Initial h values are now parsed for single searches and are effectively
None for iterated searches.
For iterated searches we parse lists of values for the attributes
expansions, evaluations, dead ends, generated states, initial_h_value,
plan_length, cost and search_time. Those are not included in the reports
yet.
We set cost=min(list_of_costs) and plan_length=min(list_of_plan_lengths)
after parsing.
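That reduction could be sketched as follows (a hypothetical helper, assuming the parsed attributes are stored in a plain dict of properties):

```python
def reduce_iterated(props):
    """For iterated search, collapse the per-iteration lists for cost and
    plan_length to their minimum, i.e. the best plan found."""
    for attr in ('cost', 'plan_length'):
        values = props.get(attr)
        if isinstance(values, list) and values:
            props[attr] = min(values)
    return props
```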
BTW: What about lines like "Initial state h value: 1147184/1703241."?
Which value should be taken here?

msg1617
Author: malte
Date: 2011-08-14.12:46:50

> Why is there a line "Initial state h value: ." in the cumulative results?

Because the cumulative results used the same code as the individual result dumps
for this (SearchProgress::print_statistics()), and this did not check if there
are any initial h values to be dumped.

> Shouldn't this either be given a value or left out?

Yes. This is now fixed. (The line is no longer included.)

> Or maybe the question is: What should be the initial_h_value of a run then?
> (The list of h_values is already parsed, just not included in the reports yet)

For now, I would suggest setting it to None or some other value that signals
"not well-defined" if there is more than one heuristic value given in the
"initial h" line. So we don't parse initial h values for multi-heuristic search.

Regarding iterated search, how do you currently handle other info that is dumped
multiple times? Does it depend on the attribute in question?

msg1613
Author: jendrik
Date: 2011-08-14.00:22:04

Why is there a line "Initial state h value: ." in the cumulative results?
Shouldn't this either be given a value or left out?
Or maybe the question is: What should be the initial_h_value of a run then? (The
list of h_values is already parsed, just not included in the reports yet)

msg1487
Author: malte
Date: 2011-08-05.22:27:32

Attaching the script I used for benchmarking in case it's any use.
Just call without arguments from new-scripts.
It assumes that exp-lama is present, and it will clobber some directory names
that you're unlikely to use, as well as the file "testme.log".

msg1486
Author: malte
Date: 2011-08-05.22:20:42

I did some tests with the exp-lama data:

Using "return 0" for get_problem_size:
  26.082s with a hot cache
  41.114s with a cold cache

So it looks like 15s were just spent reading stuff from the hard disk. This was
on my local machine, with a local hard disk. It will probably be worse when
running things on turtur et al. with things mounted via NFS, but I did not test
that.

Commenting out all properties that are parsed from output, output.sas or all.groups:
  9.465s with a cold cache
  5.448s with a hot cache

So there's a strong case to be made for avoiding parsing these files, by adding
all relevant statistics directly into the translator and preprocessor. So that
is what I suggest we should do.

msg1485
Author: jendrik
Date: 2011-08-05.21:30:47

>I'm just supposed to take the data from
>that place, but run the resultfetcher from a proper checkout of the repository?
Right, I commented out the preprocess functions there. They're back in now so you
can also use this clone for testing if you want to.

msg1484
Author: malte
Date: 2011-08-05.21:19:30

> There is text after "Finding invariants".
Strange, that should be there unconditionally. I must have overlooked that,
which is weird, because it must have been exactly the thing I was looking for.

msg1483
Author: malte
Date: 2011-08-05.21:06:04

Ah, I think I now know what I misunderstood. I'm just supposed to take the data
from that place, but run the resultfetcher from a proper checkout of the
repository?
The version of the resultfetcher at /home/downward/jendrik/downward/new-scripts
was much faster because "add_preprocess_functions" was commented out. Adding it
back in, the runtime jumps to 72 seconds for me.

msg1482
Author: malte
Date: 2011-08-05.20:58:38

> With the tip revision you can parse the results at
> /home/downward/jendrik/downward/new-scripts/exp-lama by doing:
> ./downward-resultfetcher.py path-to-exp-dir

I tried

$ time ./downward-resultfetcher.py --dest XXX-out-of-the-way exp-lama
$ time ./downward-resultfetcher.py --dest YYY-out-of-the-way exp-lama

and the first one took 14 seconds, the second one 11 seconds. CPU time was in
the 10-11 seconds range in both cases, so presumably the difference is due to
cache effects.

That's quite a bit away from the 24 seconds you mention -- did I do anything
wrong? (I was running this on turtur.) I want to make sure I'm measuring more or
less the right thing.

msg1468
Author: malte
Date: 2011-08-05.15:14:41

OK, apart from the problem size, I think these can all be reported easily by the
translator/preprocessor themselves, and maybe they should also report the
problem size. (We can change its definition to be a bit more semantic than
syntactic; the important point is that it captures all aspects of the encoding.)

msg1467
Author: jendrik
Date: 2011-08-05.14:00:03

I see. The code explains it best:

eval.add_function(translator_facts, file='output.sas')
eval.add_function(preprocessor_facts, file='output')
eval.add_function(translator_derived_vars, file='output.sas')
eval.add_function(preprocessor_derived_vars, file='output')
#eval.add_function(cg_arcs, file='output')
eval.add_function(translator_problem_size, file='output.sas')
eval.add_function(preprocessor_problem_size, file='output')
# Total mutex group sizes after translating
# (sum over all numbers following a "group" line in the "all.groups" file)
eval.add_function(translator_mutex_groups_total_size, file='all.groups')

and a regex:

eval.add_pattern('translator_mutex_groups',
                 r'begin_groups\n(\d+)\ngroup', file='all.groups',
                 type=int, flags='MS')

msg1466
Author: malte
Date: 2011-08-05.13:49:24

>> Can you give a complete list of things currently parsed from the translator
>> output files?
> Here you go:
> [...]
Sorry, by "output files" I meant the files that the translator produces
(output.sas, all.groups, test.groups) as opposed to what it writes on stdout or
stderr. So only the things that are parsed from these three files. (Since one of
our objectives here is to reduce or eliminate the need to parse these files, if
I see it correctly.)
> That's true, I used the fact that all parsing functions are given the results
> of the regex parsing and that each function has access to the results of all
> previously applied parsing functions. Now we only call the expensive
> functions, if those values are not already present.
Wonderful.

msg1465
Author: jendrik
Date: 2011-08-05.13:42:28

> For example, I would find it strange to include the number of facts and number
> and total size of mutex groups, but not include the number of variables, which
> is information that more people would be interested in, I think.
I agree.
>(BTW,
> "invariant groups" should be called "mutex groups" in the output, I think. I was
> confused by that, so I think other people might be, too.)
I have changed that in the parser and translate.py
>
> Can you give a complete list of things currently parsed from the translator
> output files?
Here you go:
['translator_auxiliary_atoms', 'translator_axioms', 'translator_derived_vars', 'translator_effect_conditions_simplified', 'translator_facts',
'translator_final_queue_length', 'translator_implied_effects_removed', 'translator_implied_preconditions_added', 'translator_mutex_groups',
'translator_mutex_groups_total_size', 'translator_operators_removed', 'translator_ops', 'translator_problem_size', 'translator_propositions_removed',
'translator_relevant_atoms', 'translator_time_building_dictionary_for_full_mutex_groups', 'translator_time_building_mutex_information',
'translator_time_building_strips_to_sas_dictionary', 'translator_time_building_translation_key', 'translator_time_checking_invariant_weight',
'translator_time_choosing_groups', 'translator_time_collecting_mutex_groups', 'translator_time_completing_instantiation', 'translator_time_computing_fact_groups',
'translator_time_computing_model', 'translator_time_detecting_unreachable_propositions', 'translator_time_generating_datalog_program',
'translator_time_instantiating', 'translator_time_instantiating_groups', 'translator_time_normalizing_datalog_program', 'translator_time_normalizing_task',
'translator_time_parsing', 'translator_time_preparing_model', 'translator_time_processing_axioms', 'translator_time_simplifying_axioms',
'translator_time_translating_task', 'translator_time_writing_mutex_key', 'translator_time_writing_output', 'translator_time_writing_translation_key',
'translator_total_queue_pushes', 'translator_uncovered_facts', 'translator_vars']
For the preprocessor we have:
['preprocessor_axioms', 'preprocessor_derived_vars', 'preprocessor_facts', 'preprocessor_ops', 'preprocessor_problem_size', 'preprocessor_vars']
> Regarding your code, why did you add the block=True in the invariant generation
> code? It didn't appear to be necessary in the example I ran, but maybe it is on
> some other inputs?
You can reproduce it by doing
$ ./test-scripts.sh
$ less exp-test/runs-00001-00100/00001/run.log
There is text after "Finding invariants".
> Of course, there remains the problem that we might still want to perform
> experiments with older versions of the translator, in which we would
> need some kind of fallback to the old behaviour. Any ideas on that?
That's true, I used the fact that all parsing functions are given the results of
the regex parsing and that each function has access to the results of all
previously applied parsing functions. Now we only call the expensive functions
if those values are not already present.
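A sketch of that mechanism, under the assumption that each parsing function receives the properties gathered so far and returns new ones (the real resultfetcher signature and the counting logic may differ):

```python
def translator_facts_from_sas(content, props):
    """Expensive fallback: scan output.sas only when the cheap run.log
    pass has not already produced translator_facts."""
    if 'translator_facts' in props:
        return {}  # value already present, skip the expensive scan
    # Illustrative count; the real function parses the file structure.
    return {'translator_facts': content.count('begin_variable')}
```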
> Can you give
> me access to the data and instructions on how to reproduce your 52 vs. 24 second
> figures? Then I'll try to investigate some alternatives.
With the tip revision you can parse the results at /home/downward/jendrik/downward/new-scripts/exp-lama by doing:
./downward-resultfetcher.py path-to-exp-dir
All the times I reported were obtained with a warm cache (somehow I'm not allowed to clear the cache on this machine) and without the expensive functions.
With all functions active the parsing time was 127s before I started optimizing and is now 30s.

msg1461
Author: malte
Date: 2011-08-05.10:55:40

It's not OK to count 100 lines with one token each as "zero".
I'd like to do some tests of my own; how can I reproduce these tests?
These data are for the complete resultfetcher call with a cold cache? What was
the total time before you started your optimizations?

msg1459
Author: jendrik
Date: 2011-08-05.03:30:02

24s: without any length measurement
50s: splitlines and split (old version)
30s: len(content.split())
46s: len(re.compile(r'\s').findall(content))
24s: content.count(' ')
Of course the last one yields other values but is significantly faster.
What do you think?
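A tiny example showing why the fastest variant, content.count(' '), is not a real token count: tokens separated by newlines or multiple spaces are miscounted.

```python
content = "a b  c\nd e\n"

# The three counting variants benchmarked above, on a small example:
per_line = sum(len(line.split()) for line in content.splitlines())  # 5 tokens
whole = len(content.split())                                        # 5 tokens
spaces = content.count(' ')                                         # 4 spaces
```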

msg1445
Author: malte
Date: 2011-08-04.12:36:19

I like the general idea, but we should use some principled way of deciding what
exactly to include in the translator output then.

For example, I would find it strange to include the number of facts and the
number and total size of mutex groups, but not include the number of variables,
which is information that more people would be interested in, I think. (BTW,
"invariant groups" should be called "mutex groups" in the output, I think. I was
confused by that, so I think other people might be, too.)

Can you give a complete list of things currently parsed from the translator
output files?

Regarding your code, why did you add the block=True in the invariant generation
code? It didn't appear to be necessary in the example I ran, but maybe it is on
some other inputs?

Of course, there remains the problem that we might still want to perform
experiments with older versions of the translator, in which case we would
need some kind of fallback to the old behaviour. Any ideas on that?

Regarding the problem size, it would be a pity to lose this functionality.
First of all, this function could be simplified to return len(content.split());
there is no need to split lines separately. Does that help with performance? If
not, generating the list before counting it might be the wasteful part; maybe
there is an in-place method we can use. Can you give me access to the data and
instructions on how to reproduce your 52 vs. 24 second figures? Then I'll try
to investigate some alternatives.
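One possible "in-place" variant in the spirit of this suggestion, which counts tokens without materializing the full token list by iterating over regex matches (a sketch only; it is not necessarily faster in practice):

```python
import re

def get_problem_size(content):
    """Count whitespace-separated tokens without building the full list."""
    return sum(1 for _ in re.finditer(r'\S+', content))
```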
BTW, when benchmarking things like this, performance can be very different with
hot vs. cold caches. Since analysis is usually run once, cold caches are
probably the best way to test this. Running
$ sync
$ echo 3 > /proc/sys/vm/drop_caches
before the measurement should do the trick, I think.

msg1441
Author: jendrik
Date: 2011-08-04.01:19:46

I have recently intensely profiled the resultfetching code and identified and
fixed many of the culprits responsible for the slow performance. One remaining
bottleneck is a set of functions that parse output, output.sas and all.groups
to find the number of facts, the number of derived variables and the total
invariant group size.
In order to be able to parse them from the much shorter run.log file, I have
added three lines of output to the translate.py module. You can find the code in
my digitaldump/downward repository on bitbucket in the issue233 branch. Do you
think this is a good solution? Should I add similar code to the preprocessor?
Apart from that I found that the function

def get_problem_size(content):
    """
    Total problem size can be measured as the total number of tokens in the
    output.sas/output file.
    """
    return sum([len(line.split()) for line in content.splitlines()])
takes a very long time for the output and output.sas files. You had requested
this functionality earlier, but do you think we can drop it to be able to parse
much faster? Parsing my test experiment takes 53 secs with and 24 secs without
this function.

Date                | User    | Action | Args
2011-08-25 14:39:27 | jendrik | set    | status: reviewing -> resolved; messages: + msg1682
2011-08-14 23:45:03 | jendrik | set    | messages: + msg1637
2011-08-14 21:38:01 | malte   | set    | messages: + msg1633
2011-08-14 21:30:03 | jendrik | set    | messages: + msg1632
2011-08-14 21:01:51 | malte   | set    | messages: + msg1630
2011-08-14 20:20:04 | jendrik | set    | messages: + msg1623
2011-08-14 12:46:50 | malte   | set    | messages: + msg1617
2011-08-14 00:22:04 | jendrik | set    | messages: + msg1613
2011-08-05 22:27:33 | malte   | set    | files: + test233.sh; messages: + msg1487
2011-08-05 22:20:43 | malte   | set    | messages: + msg1486
2011-08-05 21:30:48 | jendrik | set    | messages: + msg1485
2011-08-05 21:19:30 | malte   | set    | messages: + msg1484
2011-08-05 21:06:04 | malte   | set    | messages: + msg1483
2011-08-05 20:58:38 | malte   | set    | messages: + msg1482
2011-08-05 15:14:41 | malte   | set    | messages: + msg1468
2011-08-05 14:00:03 | jendrik | set    | messages: + msg1467
2011-08-05 13:49:24 | malte   | set    | messages: + msg1466
2011-08-05 13:42:28 | jendrik | set    | messages: + msg1465
2011-08-05 10:55:41 | malte   | set    | messages: + msg1461
2011-08-05 03:30:02 | jendrik | set    | messages: + msg1459
2011-08-04 12:36:19 | malte   | set    | messages: + msg1445
2011-08-04 01:22:06 | jendrik | set    | status: chatting -> reviewing
2011-08-04 01:19:46 | jendrik | set    | status: unread -> chatting; messages: + msg1441
2011-04-11 12:11:36 | jendrik | create |