IAN
In this paper, data augmentation is performed by flipping the action and adding it to the test set. As I understand it, this kind of data expansion will lead to poor...
# Bug Report

### Is the issue related to model conversion?

### Describe the bug

`File "tools/split_onnx_by_node.py", line 94, in cut_model
onnx.utils.extract_model(onnx_file_path, onnx_file_path + '.gegelu.onnx', ["/down_blocks.0/attentions.0/transformer_blocks.0/norm3/Mul"], slice_node.output, check_model=False)
File "/home/mlperf/miniconda3/envs/mlperf/lib/python3.8/site-packages/onnx/utils.py",...
**Describe the bug**

I tried to replace half_t with bfloat16_t in examples/47_ampere_gemm_universal_streamk/ampere_gemm_universal_streamk.cu, but encountered compilation errors.

**Steps/Code to reproduce bug**

Here is the diff

Here is the part of...
deepseek-r1, 8*H200. For some reason, one of the 8 processes reached `self.move_ready_grammar_requests()`, causing inconsistent communication and resulting in a hang.
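
The hang described above is the usual result of a collective operation that only some ranks enter. The following is a minimal sketch of that failure mode, not the project's actual code: it uses `threading.Barrier` as a stand-in for a collective call, and the names `run`, `step_barrier`, `grammar_barrier`, and `diverging_rank` are hypothetical. A timeout is used so the mismatch shows up as an error instead of a real hang.

```python
import threading

WORLD_SIZE = 8  # matches the 8 processes in the report


def run(diverging_rank, timeout=1.0):
    """Return the set of ranks whose (simulated) collective call timed out.

    Assumption: each barrier stands in for a collective that needs all
    WORLD_SIZE ranks to participate before any of them can continue.
    """
    step_barrier = threading.Barrier(WORLD_SIZE)     # every rank enters this one
    grammar_barrier = threading.Barrier(WORLD_SIZE)  # only the diverging rank enters
    stuck = set()
    lock = threading.Lock()

    def worker(rank):
        try:
            if rank == diverging_rank:
                # Only this rank takes the extra branch, so it waits for
                # peers that never arrive.
                grammar_barrier.wait(timeout=timeout)
            # The remaining ranks wait here for the rank that is stuck above.
            step_barrier.wait(timeout=timeout)
        except threading.BrokenBarrierError:
            # Raised when a barrier wait times out or the barrier is broken.
            with lock:
                stuck.add(rank)

    threads = [threading.Thread(target=worker, args=(r,)) for r in range(WORLD_SIZE)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return stuck


if __name__ == "__main__":
    print("all ranks agree:", sorted(run(diverging_rank=None)))
    print("rank 3 diverges:", sorted(run(diverging_rank=3)))
```

In this sketch a single diverging rank is enough to stall every rank: the diverging rank waits on a collective nobody else joins, and the others wait on a collective it never reaches. Without the timeout, none of the waits would return.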