Bullet simulation synchronization between all instances creates issues
Since it was decided long ago that every instance of Sigma runs its own Bullet simulation, these simulations must be kept synchronized so that all instances share the same state.
Bullet simulation state is incremental. Each step must be reproduced on all copies, i.e. the input data (the forces) must be propagated. A step cannot be dropped from a copy, or that copy's simulation will be desynchronized.
Running a desynchronized simulation leads to major problems: positions differ between instances, so each player sees a different world.
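To illustrate why a step cannot be dropped, here is a minimal sketch. It uses a toy one-dimensional integrator as a stand-in for Bullet (it is not Bullet's API, and the names `Body` and `step` are made up for the example). The authoritative simulation applies three steps, while the copy misses the second one.

```python
from dataclasses import dataclass


@dataclass
class Body:
    position: float = 0.0
    velocity: float = 0.0


def step(body: Body, force: float, dt: float, mass: float = 1.0) -> None:
    """Advance the body by one step (semi-implicit Euler integration)."""
    body.velocity += (force / mass) * dt
    body.position += body.velocity * dt


# Inputs as (force, delta time), in the order the authority applied them.
inputs = [(10.0, 0.5), (0.0, 0.5), (-4.0, 0.5)]

authoritative = Body()
for force, dt in inputs:
    step(authoritative, force, dt)

# A copy that never received the second step and ran anyway.
copy = Body()
for index, (force, dt) in enumerate(inputs):
    if index == 1:
        continue  # the missing step
    step(copy, force, dt)

diverged = authoritative.position != copy.position
print(authoritative.position, copy.position, diverged)
```

Even though the missing step carried no force, the copy ends up at a different position, because the velocity accumulated so far was never integrated over that step.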
Here are the constraints of such an implementation:
- Each state of the authoritative simulation must be identified by an incremental step ID, together with the delta time of that step. All copies must advance their state incrementally, following the stepping of the authoritative simulation.
- The simulation inputs (the forces) must be validated by the authoritative simulation and associated with a step ID and a delta time before being propagated to all the copies.
- The copies must use the inputs given by the authoritative simulation and apply them incrementally, in the same order as they were applied on the authoritative server, using the step ID and the delta time.
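The constraints above could be sketched as follows. This is only an illustration under assumptions: `StepRecord`, `ToySimulation`, `AuthoritativeSimulation` and the `MAX_FORCE` validation rule are hypothetical names, and the physics is a toy one-axis integrator rather than Bullet.

```python
from dataclasses import dataclass
from typing import Iterable, Tuple


@dataclass(frozen=True)
class StepRecord:
    step_id: int               # incremental ID assigned by the authority
    delta_time: float          # delta time used for this step
    forces: Tuple[float, ...]  # inputs validated by the authority


class ToySimulation:
    """Stand-in for the physics world: one body, one axis, unit mass."""

    def __init__(self) -> None:
        self.position = 0.0
        self.velocity = 0.0
        self.last_step_id = 0

    def apply(self, record: StepRecord) -> None:
        # Steps must be applied strictly in order, with no gap.
        if record.step_id != self.last_step_id + 1:
            raise ValueError(
                f"expected step {self.last_step_id + 1}, got {record.step_id}")
        self.velocity += sum(record.forces) * record.delta_time
        self.position += self.velocity * record.delta_time
        self.last_step_id = record.step_id


class AuthoritativeSimulation(ToySimulation):
    MAX_FORCE = 100.0  # hypothetical validation rule

    def step(self, requested: Iterable[float], delta_time: float) -> StepRecord:
        # Validate the inputs, tag them with a step ID and a delta time,
        # apply them locally, then return the record to propagate.
        valid = tuple(f for f in requested if abs(f) <= self.MAX_FORCE)
        record = StepRecord(self.last_step_id + 1, delta_time, valid)
        self.apply(record)
        return record


authority = AuthoritativeSimulation()
replica = ToySimulation()
log = [authority.step([10.0, 500.0], 0.5), authority.step([-2.0], 0.25)]
for record in log:
    replica.apply(record)  # same inputs, same order, same delta times
print(authority.position, replica.position)
```

The copy never sees the rejected force (500.0): it only replays what the authority validated, so both simulations end in the same state.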
There are some issues with the synchronization of the simulation:
- An instance must wait until it has received the data for all previous steps before applying the data for a given step. If one of the previous steps never arrives, or arrives after an excessive delay, the instance must reboot its simulation, i.e. request a full copy of the model, which is "expensive".
- Network latency must be taken into account for every packet. A packet delayed in the network affects the actual step timing: the game can "freeze" if a single packet is late, because the simulation cannot run without it.
- A policy of running the simulation anyway when a packet is missing does not work, since the instances would no longer be running the same simulation.
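The waiting behaviour described in these points can be sketched with a small reorder buffer. This is an assumption about how a copy might handle incoming steps, not an existing Sigma component: `StepBuffer`, `gap_timeout` and `needs_resync` are hypothetical names, and timestamps are passed in explicitly to keep the example deterministic.

```python
from typing import Dict, List, Optional


class StepBuffer:
    """Holds steps that arrived early and releases them strictly in order."""

    def __init__(self, gap_timeout: float) -> None:
        self.gap_timeout = gap_timeout          # seconds to wait for a missing step
        self.next_step_id = 1                   # next step the simulation can run
        self.pending: Dict[int, object] = {}    # early steps, keyed by step ID
        self.gap_since: Optional[float] = None  # when the current stall started

    def receive(self, step_id: int, payload: object, now: float) -> List[object]:
        """Store one step and return the payloads that can now be applied."""
        if step_id >= self.next_step_id:
            self.pending[step_id] = payload
        ready: List[object] = []
        while self.next_step_id in self.pending:
            ready.append(self.pending.pop(self.next_step_id))
            self.next_step_id += 1
        if self.pending:
            # Still blocked on a missing step: start (or restart) the timer.
            if ready or self.gap_since is None:
                self.gap_since = now
        else:
            self.gap_since = None
        return ready

    def needs_resync(self, now: float) -> bool:
        """True when the missing step is overdue and a model copy is needed."""
        return (self.gap_since is not None
                and now - self.gap_since > self.gap_timeout)


buffer = StepBuffer(gap_timeout=0.1)
first = buffer.receive(1, "step 1", now=0.000)     # runs immediately
stalled = buffer.receive(3, "step 3", now=0.016)   # step 2 missing: world freezes
released = buffer.receive(2, "step 2", now=0.050)  # late packet unblocks 2 and 3
print(first, stalled, released)
```

Between the arrival of step 3 and the late arrival of step 2, nothing can be simulated, which is the "freeze" described above. If step 2 had never arrived, `needs_resync` would eventually return true and the instance would have to request the whole model again.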
Conclusion
Running a Bullet simulation on each instance of Sigma creates a strong dependency on the network layer, which must deliver packets on time, in order, and with no packet loss.
- One undelivered packet will create a "reboot storm" since all instances will want to resynchronize their model.
- One delayed packet will freeze the world until it is received. This can take several frames.
For these reasons, I think that we should not run a Bullet simulation on each instance, but instead have a single authoritative simulation that runs for all of them.
Only the resulting positions will be synchronized, not the forces. If one packet is lost, we can skip it and update the model at the next frame. At 60 fps, this won't be noticeable.
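A minimal sketch of what position-only synchronization could look like on a non-authoritative instance. The names `Snapshot` and `ReplicatedModel`, the frame counter, and the "crate" object are assumptions made for the example, not part of Sigma.

```python
from dataclasses import dataclass
from typing import Dict, Tuple

Vec3 = Tuple[float, float, float]


@dataclass(frozen=True)
class Snapshot:
    frame: int                  # frame number on the authoritative simulation
    positions: Dict[str, Vec3]  # object name -> resulting position


class ReplicatedModel:
    """Keeps only the latest known positions; runs no simulation at all."""

    def __init__(self) -> None:
        self.frame = -1
        self.positions: Dict[str, Vec3] = {}

    def receive(self, snapshot: Snapshot) -> bool:
        """Apply a snapshot unless it is older than what we already have."""
        if snapshot.frame <= self.frame:
            return False  # stale or duplicate packet: ignore it
        self.frame = snapshot.frame
        self.positions.update(snapshot.positions)
        return True


model = ReplicatedModel()
model.receive(Snapshot(1, {"crate": (0.0, 1.0, 0.0)}))
# The snapshot for frame 2 is lost: nothing to wait for, nothing to replay.
model.receive(Snapshot(3, {"crate": (0.0, 0.8, 0.0)}))
# If frame 2 shows up late, it is simply discarded.
late = model.receive(Snapshot(2, {"crate": (0.0, 0.9, 0.0)}))
print(model.frame, model.positions["crate"], late)
```

Because each snapshot carries absolute positions instead of incremental inputs, a lost or late packet never blocks the instance and never forces a reboot of the model.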