tiny-gpu
On branch divergence
In real GPUs, individual threads can branch to different PCs, causing branch divergence, where a group of threads initially being processed together has to split into separate execution paths.
It's simpler than that: execution continues down both branches, and a per-thread mask controls whether each thread's computed value is stored.
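To make the mask idea concrete, here is a rough sketch in plain Python, not anything taken from tiny-gpu or a real GPU. The function name `run_divergent_branch` and the branch it evaluates (halve even values, otherwise compute 3v + 1) are made up for illustration. Every lane computes both sides, and the mask only picks which result is kept.

```python
def run_divergent_branch(values):
    """Simulate a group of threads hitting: if v is even, v //= 2, else v = 3*v + 1."""
    # Per-thread predicate mask: True where the thread takes the "then" side.
    mask = [v % 2 == 0 for v in values]

    # All lanes compute both sides of the branch in lockstep (hypothetical model).
    then_result = [v // 2 for v in values]
    else_result = [3 * v + 1 for v in values]

    # The mask decides which computed value each thread actually stores.
    return [t if m else e for m, t, e in zip(mask, then_result, else_result)]


result = run_divergent_branch([1, 2, 3, 4])
print(result)  # [4, 1, 10, 2]
```

The cost in this model is that every thread pays for both sides of the branch, even though it keeps only one result.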