Title Implementing distance based tie breaking in search
Priority feature Status chatting
Superseder Nosy List augusto, gabi, jendrik, malte, mkatz, silvan
Assigned To Keywords
Optional summary
An implementation of distance (number of 
actions) Evaluator to be used with 

Created on 2020-07-28.19:00:20 by mkatz, last changed by malte.

An implementation of distance (number of 
actions) Evaluator to be used with 
File name Uploaded Type Edit Remove malte, 2020-07-29.12:52:59 text/x-c++src
msg9704 (view) Author: malte Date: 2020-07-29.15:26:06
Yes, sorry!
msg9703 (view) Author: mkatz Date: 2020-07-29.14:24:15
Thank you! That was what I was going for.
Should it be "prototype_g(transform=adapt_costs(cost_type=one))" instead?
msg9702 (view) Author: malte Date: 2020-07-29.12:56:08
PS: use the evaluator "prototype_g(transform=adapt_costs(cost_type=normal))" to get what you described.

But let me emphasize that you trust this evaluator at your own risk -- the interactions of search, open lists and evaluator value updates are quite subtle.
msg9701 (view) Author: malte Date: 2020-07-29.12:52:59
I'm attaching an implementation of such an evaluator, but it comes with two caveats. One of them is the question of "what exactly is the g value?" I brought up in this issue discussion, the other one is a bit more technical about the interaction between search algorithms, open lists and evaluators. As a first approximation, you can think of it as the question of how to implement reopening in the presence of path-dependent evaluators. See the code for more, both caveats are discussed there in a comment.

For the tie-breaking purposes you describe, I think this might work well enough, although it's not properly tested. In addition to adding the file to the src/search/evaluators directory, also add the following plugin description to src/search/DownwardFiles.cmake:

    HELP "The prototype g-evaluator"

(The dependencies are probably incomplete, but oh well.)

In order for something like this to be mergeable, we need to address the two caveats. Perhaps they can be partially addressed through documentation, but the one related to reopening looks more subtle and may require additional thought.

Additionally, there is some ugliness regarding how exactly this uses the heuristic cache mechanism that should be addressed before merging something like this, but at least this is at the level of implementation ugliness, not unexpected behaviour.
msg9698 (view) Author: malte Date: 2020-07-29.09:50:55
Hi Michael, that can mean many things, some of them easily doable, some of them not. :-) Anyway, I think we're on the same page now.

I think it should be quite easy to do a prototype of this. A better solution would be to use this opportunity to make the change to the g/real_g code in the search algorithms that we've wanted to do for a long time and that would give us some memory savings for most planner configurations. But let's do the prototype first.
msg9695 (view) Author: mkatz Date: 2020-07-28.23:44:20
Yes, that should work. 
Note my original text "An implementation of distance (number of actions) Evaluator 
to be used with TieBreakingOpenList." :)
msg9694 (view) Author: malte Date: 2020-07-28.22:47:49
Getting exactly B could be somewhat difficult; it would be easier to do reopening based on g_cost, but update g_dist values upon expansions based on the best g_dist values seen so far for every state. I think this could easily be implemented in the existing evaluator framework with no changes to existing code, only a new evaluator added.

This would give g_dist values that always fall somewhere in between the minimum g_dist value possible for this state and the g_dist value that corresponds to the actual path found that minimizes g_cost. So in my example, the g_dist value would be either 3 or 4, depending on the order in which different nodes were expanded. Would this be acceptable?
msg9693 (view) Author: mkatz Date: 2020-07-28.22:42:57
In the use case I had in mind, option B 
should work. I would like to use f_cost 
for guiding the best-first search, 
breaking ties by g_dist, h_cost.
msg9692 (view) Author: malte Date: 2020-07-28.22:32:32
I see, I didn't realize that this was about distance from the initial state, not (estimated) distance from the goal.

It is currently possible to use a g evaluator based on distance or a g evaluator based on cost, but not both at the same time. This would be a useful addition though, and a stepping stone towards getting rid of the "internal" use of two kinds of g-evaluators within the search algorithms, which has been a goal for some time.

There are some non-trivial semantic questions due to the path-dependence of g-values. Let's say that, in a search with reopening, you want to use both a "unit-cost g evaluator" g_dist and a "real-cost g evaluator" g_cost in the same search. Consider a state that can be reached on two paths: one with g_dist = 3 and g_cost = 10, and another one with g_dist = 4 and g_cost = 8. What should the g values of the state be?

A) g_dist = 3, g_cost = 10?
B) g_dist = 4, g_cost = 8?
C) g_dist = 3, g_cost = 8?

The problem with A is that you don't get the cheapest path w.r.t. cost.
The problem with B is that you don't get the cheapest path w.r.t. distance.
The problem with C is that the g values are not consistent with each other: there is no single path that has both g_dist = 3 and g_cost = 8.

In your use case, would one of the answers work while the others wouldn't?

*Some* version of this would be comparatively simple to implement. If you need a very specific spec, it might be much more complicated. What do you need?
msg9691 (view) Author: mkatz Date: 2020-07-28.20:09:27
Does the current g evaluator implementation allows defining a separate g evaluator 
with cost transformation to unit costs? I would like to e.g., run A* with f, ties 
broken by distance, then h.
msg9690 (view) Author: augusto Date: 2020-07-28.19:52:42
As I understood it, the feature would be to use the g-evaluator considering all action costs as 1. Is this also already possible?
msg9688 (view) Author: malte Date: 2020-07-28.19:43:05
Hi Michael, can you explain what is missing?

TieBreakingOpenList can take any evaluator, and all heuristics can be configured to ignore action costs (= set all action costs to 1), which is my interpretation of a distance-based evaluator in the sense given in the summary.
Date User Action Args
2020-07-29 15:26:07maltesetmessages: + msg9704
2020-07-29 14:24:15mkatzsetmessages: + msg9703
2020-07-29 12:56:08maltesetmessages: + msg9702
2020-07-29 12:52:59maltesetfiles: +
messages: + msg9701
2020-07-29 09:50:55maltesetmessages: + msg9698
2020-07-28 23:44:20mkatzsetmessages: + msg9695
2020-07-28 22:47:49maltesetmessages: + msg9694
2020-07-28 22:42:57mkatzsetmessages: + msg9693
summary: An implementation of distance (number of actions) Evaluator to be used with TieBreakingOpenList. -> An implementation of distance (number of actions) Evaluator to be used with TieBreakingOpenList.
2020-07-28 22:32:32maltesetmessages: + msg9692
2020-07-28 20:09:27mkatzsetmessages: + msg9691
2020-07-28 19:52:42augustosetnosy: + augusto
messages: + msg9690
2020-07-28 19:43:43silvansetnosy: + silvan
2020-07-28 19:43:05maltesetstatus: unread -> chatting
nosy: + malte, jendrik
messages: + msg9688
2020-07-28 19:00:20mkatzcreate