Continuous-CBS icon indicating copy to clipboard operation
Continuous-CBS copied to clipboard

Question about solution sensitivity

Open gsalinas opened this issue 1 year ago • 3 comments

Hi!

Thank you for the very helpful paper & repository. This may not be a bug per se, but I couldn't quite make sense of it and I hoped someone might be able to help.

Consider the following problem, where A and B are two agent start positions, a and b are their goals, respectively, and ( ) is a generic node on the roadmap:

(A)
 |        <-- paths of length 1
(B)        
 |
( ) ---- some arbitrary path length greater than 1 ---- ( )
 |
(b)
 |
(a)

For the roadmap, where these edges are the only ones available, the only solution is for B to move to the side ("alcove"), allowing A to pass down to its goal, and then for B to return from the alcove and proceed to its goal. I believe this should be the optimal solution regardless of the length of the path from the hallway to the alcove, (unless moving partway along an edge and then reversing direction is permitted.)

As it happens, I was examining this scenario and discovered that the CCBS implementation is strangely sensitive to the length of the path from hallway to alcove. In my scenario, the vertical distances in the diagram above are all 1. When the horizontal distance is below some threshold, the solver finds a solution very quickly (<200ms) but when the distance crosses above the threshold, 30s becomes insufficient to find a solution. Is this an expected behavior of CBS or SIPP? Can you help me with an intuition as to why it might be happening?

I did a rough binary search and found that the threshold for my problem instance is between 7.26 and 7.27 but I can't figure out what's special about that number.

Thank you very much for any help you can provide regarding this!

./CCBS Examples/alcove_corridor_solves_quickly.xml Examples/alcove_corridor_task.xml Examples/config.xml 
Soulution found: true
Runtime: 0.130823
Makespan: 16.52
Flowtime: 20.52
Initial Cost: 6
Collision Checking Time: 0.0716953
HL expanded: 5149
LL searches: 24954
LL expanded(avg): 7.3278

./CCBS Examples/alcove_corridor_does_not_solve.xml Examples/alcove_corridor_task.xml Examples/config.xml 
Soulution found: false
Runtime: 30.0091
Makespan: 10.4142
Flowtime: 20.5355
Initial Cost: 6
Collision Checking Time: 12.0619
HL expanded: 13334
LL searches: 57764
LL expanded(avg): 8.83807

The XML descriptions of the task I'm doing, and the two versions of the roadmap where it does / doesn't solve quickly, are linked here: https://gist.github.com/gsalinas/f6145a31c2087037f12462e0886f2bcb

gsalinas avatar Oct 31 '24 00:10 gsalinas

Hi, Gerry! Thanks for so explained issue with examples. In general, this is a standard behavior of any CBS-like approach. While the number of edges/nodes in your example doesn't change, we are still able to produce infinite amount of states as we have another dimension - time. By extending the length of the edge e7 which is used by one of the agents to avoid collision, you also extend the cost of the collision-free solution. As a result, CCBS is able to create more and more partial solutions with less cost than the collision-free one. It's still looks wierd that the number of expansions and runtime blows up due to so small change in the edge cost of e7. I still haven't figured out why it happens. Just to be sure that we are testing/working with the same code - do you use the last version of the code available in the master branch?

aandreychuk avatar Nov 03 '24 09:11 aandreychuk

Yes, I built from the latest master, at b8f45e166cf39ee7e8e06f0bfe4fbab0dee68932. Incidentally, I ran it again increasing the runtime limit to 2.5 hrs (9000 seconds) and it still fails to find a solution in that case.

gsalinas avatar Nov 04 '24 18:11 gsalinas

The other thing I've noticed, perhaps not surprisingly, is that the sensitivity of when it crosses over from quickly solvable to seemingly unsolvable / much much much slower, does depend on use_cardinal and use_disjoint_splitting. I tried a few different configurations - the two files above, plus a third where the cost of the alcove is much lower, 3.0 instead of 7.26, and got the following:

use_cardinal use_disjoint_splitting alcove cost: 3 alcove cost: 7.26 alcove cost: 7.27
true true 5ms 200ms >10sec
true false 100ms >10sec >10sec
false true 3ms 130ms >10sec
false false 26ms >10sec >10sec

So it seems that at least between 7.26 and 7.27, disjoint splitting is what enables it to find a solution, but it's still surprising to me that it blows up so completely past that point.

gsalinas avatar Nov 04 '24 19:11 gsalinas