Issue

The NuScenes dataset loader has a subtle bug in the can_bus input for the BEV head: the quaternion for the ego vehicle rotation is assigned incorrectly. When the quaternion is written into can_bus[3:7], all four slots receive the same value, rotation.w.

Reproduction Example

In [1]: import pyquaternion

In [2]: import numpy as np
   ...: 

In [3]: q=pyquaternion.Quaternion([1,0,0,0])

In [4]: b=np.zeros([10])

In [5]: b
Out[5]: array([0., 0., 0., 0., 0., 0., 0., 0., 0., 0.])

In [6]: b[3:7]=q

In [7]: b
Out[7]: array([0., 0., 0., 1., 1., 1., 1., 0., 0., 0.])
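A likely explanation for the output above, offered as an assumption rather than something verified against pyquaternion internals: NumPy does not see the quaternion as a 4-element sequence, so it converts the object to a single float (the real component w) and broadcasts that value across the slice. The sketch below reproduces the pattern without pyquaternion, using a hypothetical stand-in class, ScalarLikeQuaternion, that defines only __float__.

```python
import numpy as np


class ScalarLikeQuaternion:
    """Hypothetical stand-in for a quaternion type that converts to float as w."""

    def __init__(self, w, x, y, z):
        self.q = np.array([w, x, y, z], dtype=float)

    def __float__(self):
        # Only the real component survives a float() conversion.
        return float(self.q[0])


quat = ScalarLikeQuaternion(0.5, 0.1, 0.2, 0.3)

buggy = np.zeros(10)
buggy[3:7] = quat    # the object is coerced to one float, then broadcast

fixed = np.zeros(10)
fixed[3:7] = quat.q  # a real 4-element array, copied element by element

print(buggy[3:7])
print(fixed[3:7])
```

No error or warning is raised in the buggy case, which is why the mistake is easy to miss.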

Fix

The solution is to use the quaternion's underlying numpy array for assignment:

In [8]: b[3:7]=q.q

In [9]: b
Out[9]: array([0., 0., 0., 1., 0., 0., 0., 0., 0., 0.])
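To keep the silent broadcast from coming back, a loader could route the assignment through a small helper that insists on exactly four components. This is a minimal sketch; set_can_bus_rotation is a hypothetical name and is not part of BEVFormer or VAD.

```python
import numpy as np


def set_can_bus_rotation(can_bus, wxyz):
    """Write a (w, x, y, z) quaternion into can_bus[3:7], refusing scalars."""
    components = np.asarray(wxyz, dtype=float)
    if components.shape != (4,):
        # A scalar-like input would otherwise be broadcast to all four slots.
        raise ValueError(
            f"expected 4 quaternion components, got shape {components.shape}"
        )
    can_bus[3:7] = components
    return can_bus


can_bus = set_can_bus_rotation(np.zeros(10), [1.0, 0.0, 0.0, 0.0])
print(can_bus)
```

In the loader, the helper would be called with the quaternion's component array (q.q in the session above) rather than the quaternion object itself.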

Why the Bug Remained Undetected

Two main factors contributed to the delayed discovery:

The can_bus data also includes the yaw angle (in degrees and radians) as its last two values, and these are the rotation values that matter most for planning.
Every dataset item contained the same error, so models learned to fit this incorrect data pattern.

This allowed the model to still function in open-loop evaluation scenarios where the same dataloader was used.

Discovery Process

We tried to integrate the VAD model into our simulation pipeline but faced an issue: no matter how we supplied the rotation data from the simulator, the model couldn't generate the correct turning trajectory. To troubleshoot, we inspected the training pipeline line by line and discovered the problem. We confirmed that the VAD model behaves correctly only when we input 'w, w, w, w' as the rotation in the canbus during simulation.
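For checkpoints trained with the buggy loader, the 'w, w, w, w' observation suggests a stopgap on the simulator side: reproduce the buggy layout before feeding can_bus to the model. The sketch below assumes the simulator already writes a correct (w, x, y, z) quaternion into can_bus[3:7]; to_legacy_can_bus is a hypothetical helper and does not replace retraining.

```python
import numpy as np


def to_legacy_can_bus(can_bus):
    """Return a copy whose rotation slots mimic the buggy loader: (w, w, w, w)."""
    legacy = np.array(can_bus, dtype=float, copy=True)
    legacy[3:7] = legacy[3]  # index 3 holds w when the quaternion is (w, x, y, z)
    return legacy


sim_can_bus = np.zeros(10)
sim_can_bus[3:7] = [0.7, 0.1, 0.2, 0.3]  # illustrative values only
legacy_can_bus = to_legacy_can_bus(sim_can_bus)
print(legacy_can_bus[3:7])
```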

Impact

We conclude that this issue does not affect open-loop evaluation. However, it shows up as soon as the model is used in a closed-loop environment, such as simulation or a real vehicle.

We traced the issue to its source: it first appeared in the widely adopted BEVFormer code base, where it still exists. This means all work derived from BEVFormer shares the same problem.

Required Actions

Existing checkpoints will not work correctly once the quaternion is assigned properly
Models need to be retrained with the fixed dataloader
Systems using this in closed-loop environments need to be updated
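To check whether a given loader or cached dataset still carries the bug, one option is to test the rotation slots for the broadcast pattern: four identical values that do not form a unit quaternion. This is a heuristic sketch under that assumption; looks_like_broadcast_bug is a hypothetical helper, and a sample with w = ±0.5 cannot be told apart from a valid rotation, so it is best run over many samples.

```python
import numpy as np


def looks_like_broadcast_bug(can_bus, tol=1e-6):
    """Flag can_bus[3:7] when all four values match but are not unit norm."""
    rot = np.asarray(can_bus, dtype=float)[3:7]
    all_equal = np.allclose(rot, rot[0], atol=tol)
    unit_norm = abs(np.linalg.norm(rot) - 1.0) <= tol
    # An all-zero slot is flagged too, since it is not a valid rotation.
    return bool(all_equal and not unit_norm)


buggy_sample = np.zeros(10)
buggy_sample[3:7] = 1.0                   # identity rotation after the bug
clean_sample = np.zeros(10)
clean_sample[3:7] = [1.0, 0.0, 0.0, 0.0]  # identity rotation, assigned correctly

print(looks_like_broadcast_bug(buggy_sample))
print(looks_like_broadcast_bug(clean_sample))
```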

Nov 17 '24 15:11 yjmade

@yjmade Hi, do you know where can_bus[3:7] is being used? It seems that only can_bus[:3], can_bus[-1], and can_bus[-2] are used, so a wrong can_bus[3:7] probably doesn't affect anything?

Jan 08 '25 10:01 kaitolucifer

OK, I found it. Besides being used as information for pre/post-processing, can_bus is also fed to the neural network as input data, so it makes sense that closed-loop evaluation in a simulator won't work: can_bus = self.can_bus_mlp(can_bus)[None, :, :]

Jan 09 '25 07:01 kaitolucifer

[FIX] can_bus input of quaternion mistakenly get all 4 number as same…
