huangk10
Got "ModuleNotFoundError: No module named 'flash_attn'"
After I comment out the flash_attn import, the code throws this error:

work = group.broadcast([tensor], opts)
RuntimeError: create:build/CMakeFiles/torch_npu.dir/compiler_depend.ts:91 HCCL function error: HcclCommInitRootInfo(numRanks, &rootInfo, rank, &(comm->hcclComm_)), error code is 2
[ERROR] 2025-02-10-19:56:18 (PID:1057704, Device:0, RankID:1) ERR02200 DIST call hccl api failed. ...
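For the flash_attn half of this report, one option is to guard the import instead of commenting it out by hand. The snippet below is only a minimal sketch using the standard library. The helper names `has_module` and `pick_attention_backend` are hypothetical and not part of this PR, and the sketch does not address the HCCL error above.

```python
import importlib.util


def has_module(name: str) -> bool:
    """Return True if a top-level module can be found without importing it."""
    return importlib.util.find_spec(name) is not None


def pick_attention_backend() -> str:
    """Choose an attention backend label based on what is installed.

    Hypothetical helper: the labels are illustrative, not names used by the PR.
    """
    if has_module("flash_attn"):
        return "flash_attn"
    # flash_attn is not installed (for example on an NPU machine),
    # so the caller should use its non-flash attention path instead.
    return "fallback"


if __name__ == "__main__":
    print("attention backend:", pick_attention_backend())
```

With a check like this, the code that imports flash_attn can be skipped when the package is missing, so the same script runs on machines with and without it.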
Does this PR work on multiple nodes?