mollon650 comments

Results: 5 comments by mollon650

Need help converting FastSpeech model to ONNX to run on Tensor RT

@jinfagang Can you show the code you used to convert the model to ONNX? Thanks.

[REQUEST] what‘s the difference of pipeline Parallelism between deepspeed and megatron?

@siddharth9820 Thanks for your reply. I have another question about this code: `if not fp16_master_weights_and_gradients: self.single_partition_of_fp32_groups.append(self.parallel_partitioned_bit16_groups[i][partition_id].to( self.device).clone().float().detach()) else: self.single_partition_of_fp32_groups.append(self.parallel_partitioned_bit16_groups[i][partition_id].to( self.device).clone().half().detach()) self.single_partition_of_fp32_groups[ i].requires_grad = True # keep this in case internal optimizer...

mollon650

Need help converting FastSpeech model to ONNX to run on Tensor RT

[REQUEST] what‘s the difference of pipeline Parallelism between deepspeed and megatron?

In order to be compatible with iree-turbine, make iree-turbine can support training

[QUESTION] why WrappedTorchLayerNorm sequence parallel not supported by torch LayerNorm？

[QUESTION] why WrappedTorchLayerNorm sequence parallel not supported by torch LayerNorm？