
more traces?

Open liecn opened this issue 1 year ago • 7 comments


Hi,

Could you please share more ET traces, such as the LLaMA traces you mentioned in previous issues?

Currently, I only have the converted traces from ASTRA-sim 1.0 and the Megatron trace mentioned in issue #176.

It would be really helpful if you could share more traces.

Thanks!

liecn avatar May 20 '24 03:05 liecn

Additionally, it appears that the functionality to parse the text files Transformer_HybridParallel.txt and Transformer_HybridParallel_Fwd_In_Bckwd.txt is missing.

liecn avatar May 21 '24 20:05 liecn

These text files are artifacts of ASTRA-sim 1.0, not Chakra.

The best way to get these traces is to collect them yourself by running a PyTorch model with the profiler enabled. Are you looking for instructions on how to collect them?
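For readers who want a starting point, here is a minimal sketch of that collection step, assuming PyTorch 2.x and its `torch.profiler.ExecutionTraceObserver`. The helper names `collect_execution_trace` and `demo`, the toy `Linear` model, and the output file name are hypothetical and only for illustration; a real workload would replace the toy training loop.

```python
from typing import Callable


def collect_execution_trace(run_steps: Callable[[], None], output_path: str) -> str:
    """Record a PyTorch execution trace while run_steps() executes."""
    # Imported lazily so this module can be loaded without PyTorch installed.
    from torch.profiler import ExecutionTraceObserver

    observer = ExecutionTraceObserver()
    observer.register_callback(output_path)  # file the trace is written to
    observer.start()
    try:
        run_steps()
    finally:
        # Stop recording and release the output file even if run_steps() fails.
        observer.stop()
        observer.unregister_callback()
    return output_path


def demo(output_path: str = "pytorch_et.json") -> str:
    """Trace a few steps of a toy model (a stand-in for a real workload)."""
    import torch

    model = torch.nn.Linear(16, 4)
    optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

    def run_steps() -> None:
        for _ in range(3):
            optimizer.zero_grad()
            loss = model(torch.randn(8, 16)).sum()
            loss.backward()
            optimizer.step()

    return collect_execution_trace(run_steps, output_path)
```

Calling `demo()` should write a PyTorch execution trace in JSON form to `pytorch_et.json`. That file is only the raw host-side trace; it would still need to go through the conversion steps described on the Chakra Wiki before a simulator can use it.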

srinivas212 avatar May 22 '24 02:05 srinivas212

Thank you for your response. I appreciate the instructions on the Wiki and find them clear.

I'm just interested in whether I could obtain the measured traces from your end, particularly those involving many nodes, as they would be highly beneficial for my simulation.

liecn avatar May 22 '24 02:05 liecn

Yes, I understand!

What scale are you looking at?

We are updating the comms group info in PyTorch and collecting a few traces. I will check whether we can share them externally. We do eventually want to set up a DB of traces, but hosting the DB and keeping the traces up to date are TBD.

srinivas212 avatar May 22 '24 02:05 srinivas212

Understood.

I'm currently in need of some traces for transformers and LLaMA involving tens of nodes.

Once again, I really appreciate your outstanding work!

liecn avatar May 22 '24 02:05 liecn

Are there any traces available now?

tyn513 avatar Jun 18 '24 08:06 tyn513

If more multi-node traces could be made available, it would be very helpful to me. Thank you!

32HD avatar Sep 05 '24 14:09 32HD