Cong Chen comments

Results: 3 comments by Cong Chen

Precision Problem between nemo model and hugging face model

> Hi, we are aware that some TE implementations won't generate identical results to those of HF (which uses native PyTorch). We use our fused version of operations to speed...

Getting error when set use_cache as False in generation

> +1, the same problem. In my case, the error occurs when training with deepspeed zero3_offload and multi-image inputs (generating completions for GRPO), but it seems ok during evaluation and...

Getting error when set use_cache as False in generation

> Yes, that's right [@fushh](https://github.com/fushh). I think it is an issue related to the huggingface version. What's your version of huggingface? Thanks