Issue110

Title Selective Max sometimes does not find optimal solutions
Priority bug Status resolved
Superseder Nosy List erez, gabi, malte
Assigned To erez Keywords
Optional summary

Created on 2010-08-09.18:41:33 by erez, last changed by erez.

Messages
msg453 (view) Author: erez Date: 2010-08-10.13:07:37
fixed in r4713
msg452 (view) Author: erez Date: 2010-08-10.11:46:24
I solved the bug!
It was caused by not resetting the landmarks heuristic after the sampling done 
by selective max. This caused the landmark status information to (sometimes) be 
incorrect, since it is based on StateProxy, which is based on pointers.

I did not check in the fix yet, since I have the exact timers also.
If there are no objections to the exact timers, I will check everything in.
Otherwise, I will separate the changes - let me know.
msg451 (view) Author: erez Date: 2010-08-10.10:29:46
It seems that the problem is indeed with the landmarks heuristic.
In the incorrect runs, a landmark is marked as unachieved, while in the correct 
runs it is achieved.

I will try to locate the cause of this, but it might be a good idea to switch to 
the implementation in the emil-new branch. Any thoughts?
msg450 (view) Author: erez Date: 2010-08-10.09:58:11
After some more investigation, it seems that the lmcount(admissible=true) 
heuristic, sometimes returns a value of 12 (the correct runs), and sometimes a 
value of 15 (incorrect runs), for the same state, with the same history.

I will keep looking into this (maybe try reproducing this with landmarks 
heuristic only)?
msg445 (view) Author: erez Date: 2010-08-09.18:41:32
On blocks-9-0, running ./downward --random-seed 1 --search 
"astar(selmax(lmcount(admissible=true),lmcut()))" < output

sometimes leads to finding the optimal solution (30), but sometimes leads to the 
following output:

Simplifying transitions... done!                                                                                                                             
Initializing heuristic cache... done!                                                                                                                        
random seed 1                                                                                                                                                
Initializing Exploration...                                                                                                                                  
Reading invariants from file...                                                                                                                              
done                                                                                                                                                         
Generating landmarks using the RPG/SAS+ approach                                                                                                             
Discovering action landmarks                                                                                                                                 
Discovered 16 action landmarks                                                                                                                               
Removed 0 reasonable or obedient reasonable orders                                                                                                           
Landmarks generation time: 0s                                                                                                                                
Discovered 40 landmarks, of which 0 are disjunctive                                                                                                          
          148 edges                                                                                                                                          
Discarding 0 disjunctive landmarks                                                                                                                           
Initializing landmarks count heuristic...                                                                                                                    
19 initial landmarks, 8 goal landmarks                                                                                                                       
Conducting best first search with reopening closed nodes, bound = 2147483647                                                                                 
Initializing Selective Max                                                                                                                                   
Number of features: 19                                                                                                                                       
Done Initializing Selective Max                                                                                                                              
Beginning Training                                                                                                                                           
Initializing landmark cut heuristic...                                                                                                                       
Starting Training Example Collection                                                                                                                         
Probe state space sample                                                                                                                                     
Training Example Collection Finished                                                                                                                         
Branching Factor: 2.53386                                                                                                                                    
Training Set Size: 100                                                                                                                                       
Time 0 - 2.73033e+07                                                                                                                                         
Time 1 - 6.86566e+07                                                                                                                                         
Beginning Training Classifier                                                                                                                                
Thresholds                                                                                                                                                   
Training pair: 0, 1                                                                                                                                          
1 <=> 1 - 0 > 0.991789                                                                                                                                       
Finished Training Classifier                                                                                                                                 
f = 16 [1 evaluated, 0 expanded, t=0.15s]                                                                                                                    
Best heuristic value: 16 [g=0, 1 evaluated, 0 expanded, t=0.15s]                                                                                             
f = 17 [3 evaluated, 1 expanded, t=0.15s]                                                                                                                    
f = 18 [6 evaluated, 3 expanded, t=0.15s]                                                                                                                    
f = 19 [9 evaluated, 6 expanded, t=0.15s]                                                                                                                    
f = 20 [16 evaluated, 9 expanded, t=0.15s]                                                                                                                   
Best heuristic value: 15 [g=5, 20 evaluated, 12 expanded, t=0.15s]                                                                                           
f = 21 [31 evaluated, 18 expanded, t=0.15s]                                                                                                                  
f = 22 [57 evaluated, 30 expanded, t=0.15s]                                                                                                                  
f = 23 [130 evaluated, 67 expanded, t=0.16s]                                                                                                                 
Best heuristic value: 14 [g=9, 135 evaluated, 72 expanded, t=0.16s]                                                                                          
f = 24 [231 evaluated, 115 expanded, t=0.17s]                                                                                                                
f = 25 [324 evaluated, 176 expanded, t=0.19s]                                                                                                                
Best heuristic value: 13 [g=11, 654 evaluated, 323 expanded, t=0.23s]                                                                                        
f = 26 [727 evaluated, 357 expanded, t=0.24s]                                                                                                                
f = 27 [1236 evaluated, 587 expanded, t=0.3s]                                                                                                                
f = 28 [3056 evaluated, 1393 expanded, t=0.52s]                                                                                                              
f = 29 [6154 evaluated, 2683 expanded, t=0.89s]                                                                                                              
Best heuristic value: 12 [g=15, 6236 evaluated, 2721 expanded, t=0.9s]                                                                                       
f = 30 [16488 evaluated, 7000 expanded, t=2.11s]                                                                                                             
f = 31 [40783 evaluated, 16031 expanded, t=5.05s]                                                                                                            
f = 32 [104347 evaluated, 41093 expanded, t=12.69s]                                                                                                          
f = 33 [237401 evaluated, 98162 expanded, t=29s]                                                                                                             
Best heuristic value: 11 [g=19, 237889 evaluated, 98347 expanded, t=29.05s]                                                                                  
f = 34 [472978 evaluated, 212449 expanded, t=58.11s]                                                                                                         
f = 35 [892478 evaluated, 440232 expanded, t=108.84s]                                                                                                        
Best heuristic value: 10 [g=21, 893759 evaluated, 440746 expanded, t=108.95s]                                                                                
^CPeak memory: 1234024 KB             

Although the random seed is fixed, there is some randomness in the timing.
However, the threshold (which is the only thing that the timing is needed for) 
is always between 0.97 and 0.99, which means the runs should be the same.

Any ideas?
History
Date User Action Args
2010-08-10 13:07:37erezsetstatus: need-eg -> resolved
messages: + msg453
2010-08-10 11:46:24erezsetmessages: + msg452
2010-08-10 10:29:46erezsetmessages: + msg451
2010-08-10 09:58:11erezsetmessages: + msg450
2010-08-09 18:41:33erezcreate