Don't assume all devices with the same runtime have compatible binaries
Running python examples\beautiful_mnist_multigpu.py fails on my machine. After adding some logging in ops_gpu.py I got this output: Error: The program ISA amdgcn-amd-amdhsa--gfx1103 is not compatible with the device ISA amdgcn-amd-amdhsa--gfx1102
I'm running this under Windows on a laptop with two GPUs: an integrated AMD Radeon(TM) 780M and a discrete AMD Radeon(TM) RX 7700S. Their ISAs (gfx1103 and gfx1102) are not compatible, so an OpenCL program compiled for one will not run on the other, even though both devices use the same runtime.
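To illustrate the idea, here is a minimal standalone sketch, not the actual tinygrad code. It assumes a compile cache that is keyed by the device's target ISA in addition to the source, so a binary built for one GPU is never handed to the other. CompileCache and fake_compile are hypothetical names made up for this example.

```python
from typing import Callable, Dict, Tuple


class CompileCache:
    """Hypothetical cache keyed by (target ISA, source) instead of source alone."""

    def __init__(self, compile_fn: Callable[[str, str], bytes]) -> None:
        self._compile_fn = compile_fn
        self._cache: Dict[Tuple[str, str], bytes] = {}

    def get(self, target: str, src: str) -> bytes:
        key = (target, src)
        if key not in self._cache:
            # Cache miss: compile for this specific target only.
            self._cache[key] = self._compile_fn(target, src)
        return self._cache[key]


def fake_compile(target: str, src: str) -> bytes:
    # Stand-in for a real compiler: the "binary" just records its target.
    return f"{target}:{src}".encode()


cache = CompileCache(fake_compile)
src = "kernel void k() {}"
bin_igpu = cache.get("gfx1103", src)  # integrated GPU
bin_dgpu = cache.get("gfx1102", src)  # discrete GPU
bin_igpu_again = cache.get("gfx1103", src)  # served from the cache
```

With a cache keyed on the source alone, the second lookup would have returned the gfx1103 binary for the gfx1102 device, which matches the failure reported above.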
Changes
Name Lines Diff Tokens/Line Diff
-------------------------- ------- ------ ------------- ------
tinygrad/engine/realize.py 151 +0 14.8 +0.1
total lines changes: 0
Needs tests
What this needs is a (fake?) device that uses a cachekey for the Compiler that's different for each instance. Ideally with some overlap, so that sharing is verified as well:
"FAKE:0" -> cachekey "0"
"FAKE:1" -> cachekey "1"
"FAKE:2" -> cachekey "0"
This compiler should make sure that the cachekey is part of the returned "compiled" bytes, so that the runtime can verify it was called with a matching binary. To round it out, the fake compiler can also count how often it compiles, which is a good check that the cache actually works as intended.
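A rough sketch of what such a test double could look like, written as standalone Python rather than against the real tinygrad Compiler interface. FakeCompiler, FakeRuntime and compile_cached are hypothetical names, and the "|" separator used to embed the cachekey is an assumption for illustration only.

```python
from typing import Dict, Tuple

# Device -> cachekey mapping with deliberate overlap between FAKE:0 and FAKE:2.
CACHEKEYS: Dict[str, str] = {"FAKE:0": "0", "FAKE:1": "1", "FAKE:2": "0"}


class FakeCompiler:
    """Embeds its cachekey in the output and counts real compilations."""

    def __init__(self, cachekey: str) -> None:
        self.cachekey = cachekey
        self.compile_count = 0

    def compile(self, src: str) -> bytes:
        self.compile_count += 1
        return f"{self.cachekey}|{src}".encode()


class FakeRuntime:
    """Rejects binaries that were compiled for a different cachekey."""

    def __init__(self, expected_cachekey: str) -> None:
        self.expected_cachekey = expected_cachekey

    def load(self, lib: bytes) -> str:
        key, _, src = lib.decode().partition("|")
        if key != self.expected_cachekey:
            raise RuntimeError(f"binary for {key!r} loaded on {self.expected_cachekey!r}")
        return src


compilers = {dev: FakeCompiler(key) for dev, key in CACHEKEYS.items()}
runtimes = {dev: FakeRuntime(key) for dev, key in CACHEKEYS.items()}
cache: Dict[Tuple[str, str], bytes] = {}


def compile_cached(device: str, src: str) -> bytes:
    # The cache is shared across devices but keyed by the compiler's cachekey.
    compiler = compilers[device]
    key = (compiler.cachekey, src)
    if key not in cache:
        cache[key] = compiler.compile(src)
    return cache[key]


for device in CACHEKEYS:
    runtimes[device].load(compile_cached(device, "kernel"))

total_compiles = sum(c.compile_count for c in compilers.values())
```

In this sketch "FAKE:2" reuses the binary compiled for "FAKE:0", so only two compilations happen for three devices, and every runtime accepts the binary it is given.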
I just found some code in test/external/external_test_speed_llama.py that overrides part of the default device. I think I can get something similar to work for testing this change.
stale