Salomé: you're right, I should have reacquainted myself better with the code.
:-) The second for loop needs a deterministic iteration order for the algorithm
to be deterministic.
For what it's worth, Satellite #33 has roughly 1 million unary operators (of
which a whopping 20 are pruned).
Let's look at the two candidate algorithms we've considered:
A) Use an unordered_map to perform the pruning, then sort the unary operators at
the end.
B) Use a map to perform the pruning which uses a comparison function that
guarantee a canonical order.
I expect that algorithm A) is much faster than B). Let's simplify a bit and
assume that we have reasonable hash functions and that the "small steps" of the
algorithm such as comparing keys are O(1). This is not quite true, but it's true
if we assume a fixed upper bound on the number of preconditions of a unary
operators, which is often true.
Without the sort, algorithm A is O(n) for n operators. Adding a sort makes it
O(n log n).
Algorithm B is O(n log n) anyway, and the constant factor involved in
manipulating a large map tends to be much worse than the one involved in
sorting. (One way to view this is that Algorithm B, as part of its job, performs
a sort of the unary operators, too: it sorts by inserting lots of things into a
map and then extracting them into a vector. Inserting into a map and then
extracting a vector is an O(n log n) sorting algorithm, but it tends to be much
slower than std::sort in terms of absolute time.)
BTW, unfortunately sorting by operator_no is not enough to give unary operators
a deterministic order because every operator can spawn many unary operators (and
the operator_no is not an ID for the unary operator but the original operator's
number, unless I'm much mistaken).
We would need a comparison function that also takes into account at least the
effect and the preconditions. (operator_no + effect is enough if we don't have
conditional effects, but not if we have them). Once we compare these three, we
might as well compare all four things that define the operator (operator_no,
precondition, effect, cost).
I think it would be good to compare ID first because two unary operators will
have different operator IDs 99% of the time, and then the test is done quickly.
An alternative could be to give unary operators a unique numerical ID, too, but
I'd much rather not. I'd be worried about the memory footprint; this is a hot
data structure in heuristic computation, and keeping it small can be key to
keeping the main data we need in the cache.
|