housebaby
@danpovey thanks for your quick reply. Can you give more details on what limits kaldi to supporting multiple GPUs? We made some modifications to cu-device.cc and wrapped a decoder with...
@btiplitz thanks a lot. Can you give more details on which part of the NVIDIA code needs to change? Is it the code of libraries like libcuda.* and libcublas*?
> We're currently looking at sharing the allocator, but that's mostly for a single-GPU configuration.
> For multi-GPU, given the fact that no communication between devices is needed, why...
@hugovbraun Actually, I have no idea whether the cu-device.* files will still be used in kaldi10.
> @housebaby are you looking to use one GPU per stream? Or one CPU thread per stream? If that's the case, I would strongly suggest to take a look at...
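Since the advice above is to use one GPU per stream of work, a common pattern (not from this thread; a sketch under my own assumptions) is to run one decoding process per GPU and pin each process to its device via the `CUDA_VISIBLE_DEVICES` environment variable, which Kaldi's CUDA binaries respect. The decoder command shown in the comment is hypothetical; only the process-per-GPU pinning is the point:

```python
import os
from multiprocessing import Pool

def worker(args):
    gpu_id, utt_chunk = args
    # Pin this child process to a single GPU before any CUDA context exists.
    os.environ["CUDA_VISIBLE_DEVICES"] = str(gpu_id)
    # A real pipeline would launch the decoder on this chunk here, e.g.
    # subprocess.run(["online2-wav-nnet3-latgen-faster", ...])  # hypothetical args
    return gpu_id, os.environ["CUDA_VISIBLE_DEVICES"], len(utt_chunk)

def decode_on_gpus(chunks):
    # One worker process per GPU; chunk i is decoded on GPU i.
    with Pool(len(chunks)) as pool:
        return pool.map(worker, list(enumerate(chunks)))

if __name__ == "__main__":
    print(decode_on_gpus([["utt1", "utt2"], ["utt3"]]))
```

Because the environment variable is set in each child process before CUDA initializes, every decoder sees exactly one device as device 0, so no changes to cu-device.cc are needed for this style of multi-GPU use.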
> @housebaby have you run nvidia-smi to check the load during testing? The 2nd GPU would only help if you max out the GPU. And Dan makes a point on multiple...
> Several of the parameters affect accuracy, like the lattice, so combining those in a performance test seems like a mistake. The code stuffs data into a queue and...
> You're right, the current neural net context switch mechanism of the online pipeline has been designed for CNN-based networks.
>
> Regarding relying on the inner state of a...