Neuron support in Axlearn

Open apoorvtintin opened this issue 1 year ago • 3 comments

This PR enables the use of Neuron devices in Axlearn for model training.

  • Chooses the correct mesh for TRN devices for Fuji 7B via the mesh selector flag --mesh_selector=neuron-trn1.32xlarge-64
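As a rough illustration of how a selector string such as the one above could pick a mesh, here is a minimal sketch that matches the string against a list of regex rules. The rule table, the function name `select_mesh_shape`, and the mesh shapes are all hypothetical placeholders, not values or APIs taken from this PR.

```python
import re
from typing import Optional, Sequence, Tuple

MeshShape = Tuple[int, ...]

# Hypothetical rule table: (regex over the selector string, mesh shape).
# The shapes are illustrative placeholders, NOT the ones used by this PR.
EXAMPLE_MESH_RULES: Sequence[Tuple[str, MeshShape]] = (
    (r"neuron-trn1\.32xlarge-\d+", (-1, 8)),
    (r"gpu-.*", (-1, 4)),
)


def select_mesh_shape(
    mesh_selector: str,
    rules: Sequence[Tuple[str, MeshShape]] = EXAMPLE_MESH_RULES,
) -> Optional[MeshShape]:
    """Returns the shape of the first rule whose regex fully matches the selector."""
    for pattern, shape in rules:
        if re.fullmatch(pattern, mesh_selector):
            return shape
    # No rule matched; the caller would fall back to a default mesh.
    return None


shape = select_mesh_shape("neuron-trn1.32xlarge-64")
```

Matching on the full string keeps a rule for one instance type from accidentally applying to a selector that merely contains it as a substring.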

apoorvtintin avatar Jul 01 '24 20:07 apoorvtintin

@apoorvtintin I see this PR has been stale for some time. If there is no objection, I'd like to have @Ruixuan, who is working on Trn on our end, port your change and continue iterating on it.

kelvin-zou avatar Jul 09 '24 15:07 kelvin-zou

Apoorv is on PTO right now. I am OK with you all taking over this PR. Can you add us as reviewers when you finish? Thanks

patrick-toulme avatar Jul 09 '24 17:07 patrick-toulme

Thanks for all the reviews. I addressed most of the comments on the PR.

apoorvtintin avatar Jul 24 '24 23:07 apoorvtintin

Is this PR still needed?

kelvin-zou avatar Jan 08 '25 17:01 kelvin-zou

Not needed, closing this

apoorvtintin avatar Jan 12 '25 00:01 apoorvtintin