Santosh Bhavani

Results: 12 issues by Santosh Bhavani

Getting train_loss: nan at every step when running training for SSD. I tried varying batch_size and learning_rate, but saw no improvement. InvalidArgumentError (see above for traceback): Nan in summary histogram for:...

Even having simple Python scripts would be helpful

*Concise Description:* I'd like to use JAX for distributed training of LLMs. In addition, the new release of Keras supports JAX as a backend in addition to TF. *Describe the...

# Description Our examples are split between examples/ and docs/examples/. We also have features (e.g. inference) hidden in our examples that would be worth summarizing in a single README. I'd...

Are there any plans to integrate the embedding_modules or custom samplers back into TorchRec?

I saw MLX is adding DPO, ORPO, PPO and GRPO. Any plans to add those to AXLearn as well?

**Describe the bug** A clear and concise description of what the bug is. **Reproduction** 1. What command or script did you run? ```none pip install openmim ``` **Environment** Using Python...

### Expected behaviour Getting the version of opencv should allow a user to pip install that same version in a different environment ### Actual behaviour The version number attribute for...

## **Description** This outlines the current status of gpt-oss features that need to be implemented in Megatron Core, leveraging Transformer Engine. **✅ UPDATE: All core GPT-OSS functionality is now available...
