huangk10

4 comments by huangk10

Got ModuleNotFoundError: No module named 'flash_attn'

After commenting out the flash_attn import, the code throws this error:

work = group.broadcast([tensor], opts)
RuntimeError: create:build/CMakeFiles/torch_npu.dir/compiler_depend.ts:91 HCCL function error: HcclCommInitRootInfo(numRanks, &rootInfo, rank, &(comm->hcclComm_)), error code is 2...
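As an alternative to commenting the import out by hand, the flash_attn dependency can be made optional with a guarded check. The following is only a minimal sketch: the helper `has_module` and the flag `USE_FLASH_ATTN` are hypothetical names, not part of the project under discussion, and it assumes the calling code has a non-flash attention path to fall back to. It does not address the HCCL error above.

```python
import importlib.util


def has_module(name: str) -> bool:
    """Return True if the top-level module `name` can be located.

    find_spec looks the module up without importing it, so a missing
    package does not raise ModuleNotFoundError here.
    """
    return importlib.util.find_spec(name) is not None


# Hypothetical flag: the real project may select its attention
# implementation through a different mechanism.
USE_FLASH_ATTN = has_module("flash_attn")

if not USE_FLASH_ATTN:
    # Assumption: a fallback attention implementation exists elsewhere.
    print("flash_attn not installed; using the fallback attention path")
```

Code that needs flash_attn can then branch on `USE_FLASH_ATTN` instead of failing at import time.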

work = group.broadcast([tensor], opts)
RuntimeError: create:build/CMakeFiles/torch_npu.dir/compiler_depend.ts:91 HCCL function error: HcclCommInitRootInfo(numRanks, &rootInfo, rank, &(comm->hcclComm_)), error code is 2
[ERROR] 2025-02-10-19:56:18 (PID:1057704, Device:0, RankID:1) ERR02200 DIST call hccl api failed.
...

Does this PR work on multiple nodes?