[FIX] can_bus input of quaternion mistakenly get all 4 number as same…
Issue
A subtle bug was discovered in the NuScenes dataset loader: the canbus input for the BEV head is assigned incorrect quaternion values for the ego vehicle rotation. Specifically, when the quaternion is assigned to can_bus[3:7], all four components receive the same value, rotation.w.
Reproduction Example
In [1]: import pyquaternion
In [2]: import numpy as np
...:
In [3]: q=pyquaternion.Quaternion([1,0,0,0])
In [4]: b=np.zeros([10])
In [5]: b
Out[5]: array([0., 0., 0., 0., 0., 0., 0., 0., 0., 0.])
In [6]: b[3:7]=q
In [7]: b
Out[7]: array([0., 0., 0., 1., 1., 1., 1., 0., 0., 0.])
Fix
The solution is to use the quaternion's underlying numpy array for assignment:
In [8]: b[3:7]=q.q
In [9]: b
Out[9]: array([0., 0., 0., 1., 0., 0., 0., 0., 0., 0.])
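To guard against the same mistake elsewhere, the assignment could be wrapped in a small helper that checks the shape before writing. This is only a sketch: `set_can_bus_rotation` is a hypothetical name, not something in the code base, and it assumes the caller passes the four components explicitly (for pyquaternion, `rotation.q` as in the fix above).

```python
import numpy as np


def set_can_bus_rotation(can_bus, wxyz):
    """Write a (w, x, y, z) quaternion into can_bus[3:7].

    Hypothetical helper for illustration; not part of the code base.
    """
    quat = np.asarray(wxyz, dtype=np.float64)
    # Reject anything that is not exactly four components, so a scalar
    # can never be silently broadcast across the whole slice.
    if quat.shape != (4,):
        raise ValueError(
            f"expected 4 quaternion components, got shape {quat.shape}"
        )
    can_bus[3:7] = quat
    return can_bus


can_bus = np.zeros(10)
set_can_bus_rotation(can_bus, [1.0, 0.0, 0.0, 0.0])
```

The explicit shape check is the point of the sketch: a value that collapses to a single number raises an error instead of filling all four slots.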
Why the Bug Remained Undetected
Two main factors contributed to the delayed discovery:
- The canbus data includes the yaw angle (in degrees and in radians) as its last two values, and these are the rotation values that matter most for planning
- Since all dataset items contained the same error, models overfit to this incorrect data pattern
This allowed the model to still function in open-loop evaluation scenarios where the same dataloader was used.
Discovery Process
We tried to integrate the VAD model into our simulation pipeline but faced an issue: no matter how we supplied the rotation data from the simulator, the model couldn't generate the correct turning trajectory. To troubleshoot, we inspected the training pipeline line by line and discovered the problem. We confirmed that the VAD model behaves correctly only when we input 'w, w, w, w' as the rotation in the canbus during simulation.
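The 'w, w, w, w' observation can be reproduced on the simulator side with a few lines. The sketch below is a stopgap for checkpoints trained with the faulty loader, not a fix; `emulate_legacy_rotation` is a hypothetical name, and it assumes the quaternion is stored w-first in can_bus[3:7], as in the reproduction example.

```python
import numpy as np


def emulate_legacy_rotation(can_bus):
    """Return a copy of can_bus whose rotation slot is (w, w, w, w).

    Hypothetical helper: mimics what the faulty loader produced, so a
    checkpoint trained on that data sees the input pattern it expects.
    """
    legacy = np.array(can_bus, dtype=np.float64)  # np.array copies by default
    legacy[3:7] = legacy[3]  # index 3 holds w (assumed w-first order)
    return legacy


# Simulator-side can_bus with a correctly written quaternion.
sim_can_bus = np.zeros(10)
sim_can_bus[3:7] = [0.7, 0.1, 0.2, 0.3]

legacy_can_bus = emulate_legacy_rotation(sim_can_bus)
```

The input vector is left untouched, so the correct quaternion remains available for a retrained model.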
Impact
We conclude that this issue does not affect open-loop evaluation. But anyone who uses the model in a closed-loop environment, such as simulation or a real vehicle, will run into it.
We traced the source of this issue: it first appeared in the widely adopted BEVFormer code base, and it still exists there. This means all work derived from BEVFormer shares the same problem.
Required Actions
- Existing checkpoints will not work correctly
- Models need to be retrained with the fixed dataloader
- Systems using this in closed-loop environments need to be updated
@yjmade
Hi, do you know where can_bus[3:7] is being used?
It seems like only can_bus[:3], can_bus[-1] and can_bus[-2] are being used, so a wrong can_bus[3:7] probably doesn't affect anything?
Ok, I found it.
Besides being used as information for pre/post-processing, can_bus is also treated as input data to the neural network.
So it makes sense that closed-loop evaluation in a simulator won't work.
can_bus = self.can_bus_mlp(can_bus)[None, :, :]