
Add Intel XPU support

Open · TheMrCodes opened this issue 11 months ago • 8 comments

Describe the feature you'd like
Implementation of an Intel GPU backend via PyTorch.

Additional context
This could also be used for integrated Arc GPUs (Arrow Lake and up).

The whole thing is quite simple because PyTorch already supports Intel XPUs as a backend. I have already implemented a version with IPEX (Intel Extension for PyTorch) support that uses nearly the same code path as CUDA GPUs: https://github.com/TheMrCodes/Kokoro-FastAPI
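
As a rough illustration of what the XPU path looks like, here is a minimal device-selection sketch (not the actual code from the fork); it assumes a PyTorch build that exposes torch.xpu (2.4+ natively, or older releases together with intel_extension_for_pytorch):

```python
import torch
import torch.nn as nn

def pick_device() -> torch.device:
    """Prefer Intel XPU, then CUDA, then CPU."""
    if hasattr(torch, "xpu") and torch.xpu.is_available():
        return torch.device("xpu")
    if torch.cuda.is_available():
        return torch.device("cuda")
    return torch.device("cpu")

device = pick_device()
# Any nn.Module (the Kokoro model included) moves to the XPU the same way it would to CUDA.
model = nn.Linear(16, 16).to(device)
x = torch.randn(1, 16, device=device)
print(device, model(x).shape)
```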

TheMrCodes · Jan 31 '25 21:01

Oh, this looks great. Will take a look as soon as I have the v1_0 support out.

remsky · Feb 02 '25 10:02

This is great news! Does anyone have any concrete performance numbers? I'm thinking about buying a mini PC with an N100 for this.

gitchat1 · Feb 02 '25 17:02

@gitchat1 Sorry to disappoint, but my implementation currently only supports integrated Xe graphics (i.e. the Intel Core Ultra line) and dedicated Intel Arc GPUs via IPEX (the Intel Extension for PyTorch). For CPU, the ONNX implementation is the way to go; I don't know whether there are any performance benchmarks yet for CPUs with AVX2.
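
For anyone trying the CPU path, a minimal onnxruntime sketch looks like this; the model filename is a placeholder and the real Kokoro ONNX export has its own input signature, so inspect the graph first:

```python
import onnxruntime as ort

# Load the exported model on the CPU execution provider.
# "kokoro.onnx" is a placeholder path, not a file shipped by this repo.
session = ort.InferenceSession("kokoro.onnx", providers=["CPUExecutionProvider"])

# Print the actual input names, shapes, and dtypes before building a feed dict.
for inp in session.get_inputs():
    print(inp.name, inp.shape, inp.type)
```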

TheMrCodes · Feb 02 '25 17:02

My CPU does have AVX2 support; it's okay, but not overwhelmingly quick. I was hoping to get a little speed boost without having to spend too much money.

gitchat1 · Feb 02 '25 18:02

Note: This could be optimized further, with support for more Intel hardware specifics, if OpenVINO were implemented. But as far as I know, the whole model has to be TorchScript-compatible (their converter uses torch.jit.trace) to be able to convert it. Once converted into their IR, it can be used via ONNX or directly in torch (with the OpenVINO backend executor) for more performance, even on CPUs.
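
For context, the conversion path being described would look roughly like the sketch below, using a stand-in module rather than the real Kokoro model (which would have to be traceable end-to-end for this to work) and assuming the openvino package (2023.1+) is installed:

```python
import torch
import torch.nn as nn
import openvino as ov

class TinyModel(nn.Module):
    """Stand-in for a traceable model; Kokoro itself would need to be fully traceable."""
    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return torch.relu(x) * 2.0

example = torch.randn(1, 16)
# convert_model traces the module (torch.jit.trace under the hood) into OpenVINO IR.
ov_model = ov.convert_model(TinyModel(), example_input=example)
# Compile the IR for a target device: "CPU" here; "GPU" would target an Intel iGPU/Arc card.
compiled = ov.compile_model(ov_model, device_name="CPU")
result = compiled([example.numpy()])[0]
print(result.shape)
```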

TheMrCodes · Feb 07 '25 09:02

+1 for interest in this. I think Kokoro could potentially run great even on low-power Intel iGPUs/NPUs in laptops.

I have been having issues trying to get it to run via the OpenVINO ONNX Runtime path; I keep getting errors related to tensor shapes (could just be a skill issue on my side!). I think the IPEX route should work: with TheMrCodes' fork above I can get torch XPU to confirm it can see the Intel iGPU/NPU, but I get an undefined symbol: _ZNK5torch8autograd4Node4nameEv error when trying to run Kokoro (probably also a skill issue).

Having a container that runs with IPEX, particularly on the NPU, would be great. I came across https://github.com/ellenhp/whisper-npu-server/, which runs Whisper on an Intel NPU. I have been trying to set up the same for Kokoro but am out of my depth and can't get it running properly. Would be awesome to have it supported here!
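
An undefined-symbol error like that usually points at a torch / IPEX version (ABI) mismatch, so a quick diagnostic sketch along these lines can help narrow it down; it only assumes torch, with IPEX optional:

```python
import torch

# Print the versions that have to match: prebuilt intel_extension_for_pytorch wheels
# are built against a specific torch release, and a mismatch typically shows up as
# an "undefined symbol" error at import or first use.
print("torch:", torch.__version__)
try:
    import intel_extension_for_pytorch as ipex  # only present on IPEX installs
    print("ipex:", ipex.__version__)
except ImportError:
    print("ipex: not installed")

# On torch 2.4+ the XPU backend is exposed natively.
if hasattr(torch, "xpu"):
    print("xpu available:", torch.xpu.is_available())
    if torch.xpu.is_available():
        print("device:", torch.xpu.get_device_name(0))
```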

spacewed · Feb 18 '25 22:02

Any news on this? It would be great! ;)

Bumoch · May 09 '25 22:05

@TheMrCodes, my Intel CPU has integrated UHD Graphics 630. Would your version be able to offload the processing to the iGPU? I cloned your repo, but the Docker build fails with an error saying the "gpu" variable is not set in optional-dependencies.

I'm not even sure whether using the GPU instead of the CPU would make any significant difference in my case, but I was wondering if you could help me make the necessary adjustments to the Dockerfile and docker-compose configuration so I can at least test it.

Thanks in advance

rockstar2020 · May 17 '25 04:05