Cristian Vicas

Results 9 comments of Cristian Vicas

Found this: https://github.com/home-assistant/home-assistant/issues/16708#issuecomment-422902879 So, from Python, I started the server using this configuration: ```@asyncio.coroutine def broker_coro(): config = { 'listeners': { 'default': { 'max-connections': 50000, 'bind': '0.0.0.0:1883', 'type': 'tcp', },...

I had some pesky issues with deepspeed and flash attention on Ubuntu, in an env that contained only pytorch+huggingface+accelerate. What I did was re-create the env with cuda, cudatoolkit...

What I also tried:
- Re-downloaded the models.
- Created configs using ``accelerate config``.
- Tried different combinations with and without FSDP (deepspeed required some non-trivial install, I skipped...

This might be a different issue but I tried to do ``accelerate test`` with and without distributed computing enabled. Running with this config will block the test: ``` - `Accelerate`...

2x 4080 Super 16 GB on PCIe x8. Driver Version: 545.29.06 (as reported by nvidia-smi): GIGABYTE GeForce RTX 4080 SUPER WINDFORCE V2 16GB GDDR6X 256-bit DLSS 3.0 GIGABYTE GeForce RTX 4080...

I ran the above script in the cuda=11.8 env described here https://github.com/huggingface/accelerate/issues/2812#issuecomment-2139409879: ``` $ python Python 3.11.9 | packaged by conda-forge | (main, Apr 19 2024, 18:36:13) [GCC 12.3.0] on...

Nope, but I have a strong clue: I have nvidia 545.29.06 drivers ("closed" version). I've done some reading and I *think*: - the 545 (or more versions) have a bug...

This is not stale yet, but it will be some weeks before I manage to play around with benchmarks and especially the NVIDIA drivers.

Update: I was "forced" by Ubuntu to upgrade the NVIDIA driver. On 550.107.02 without any other software intentionally installed (e.g. nccl), things behave as expected. So it was something to...
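
For anyone who wants to check whether they are on the same driver branch, here is a minimal sketch in Python. It only compares version numbers: the helper names (`parse_driver_version`, `is_suspect_branch`, `installed_driver_version`) are made up for illustration, and treating the whole 545 branch as suspect is an assumption based on the two versions mentioned above (545.29.06 misbehaving, 550.107.02 behaving as expected), not a confirmed list of affected drivers.

```python
import subprocess


def parse_driver_version(text):
    """Turn a version string such as '545.29.06' into a tuple of ints."""
    return tuple(int(part) for part in text.strip().split("."))


def is_suspect_branch(version, suspect_majors=(545,)):
    """Return True if the major version is in the ASSUMED list of suspect branches.

    The default (545,) is a guess taken from this thread, not an official list.
    """
    return version[0] in suspect_majors


def installed_driver_version():
    """Ask nvidia-smi for the driver version of the first GPU it reports."""
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=driver_version", "--format=csv,noheader"],
        capture_output=True,
        text=True,
        check=True,
    )
    # nvidia-smi prints one line per GPU; the driver is the same for all of them.
    return parse_driver_version(result.stdout.splitlines()[0])
```

On a machine with the NVIDIA driver installed, `is_suspect_branch(installed_driver_version())` tells you whether the local driver falls in the assumed suspect branch. The parsing helpers can be used on their own with a version string copied from the `nvidia-smi` output.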