Santosh Bhavani issues

Results 12 issues of Santosh Bhavani

nan train loss while training SSD

Getting train_loss: nan at every step when running training for SSD. I tried varying batch_size and learning_rate, but saw no improvement. InvalidArgumentError (see above for traceback): Nan in summary histogram for:...

Add COCO and VOC export

Even having simple Python scripts would be helpful

Create Dockerfile

[feature-request] Support for JAX container

*Concise Description:* I'd like to use JAX for distributed training of LLMs. In addition, the new release of Keras supports JAX as a backend in addition to TF. *Describe the...

Create README.md for examples/

# Description Our examples are split between examples/ and docs/examples/. We also have features (e.g. inference) hidden in our examples that would be worth summarizing in a single README. I'd...

Integration with TorchRec

Are there any plans to integrate the embedding_modules or custom samplers back into TorchRec?

[Enhancement] Add PPO/GRPO

I saw MLX is adding DPO, ORPO, PPO and GRPO. Any plans to add those to AXLearn as well?

Not compatible with Python 3.12

**Describe the bug** A clear and concise description of what the bug is. **Reproduction** 1. What command or script did you run? ```none pip install openmim ``` **Environment** Using Python...

cv2.version does not print the full version

### Expected behaviour Getting the version of opencv should allow a user to pip install that same version in a different environment ### Actual behaviour The version number attribute for...

gpt-oss implementation

## **Description** This outlines the current status of gpt-oss features that need to be implemented in Megatron Core, leveraging Transformer Engine. **✅ UPDATE: All core GPT-OSS functionality is now available...

enhancement