Skip checkpointing at step=0

Open khatwanimohit opened this issue 10 months ago • 2 comments

Description

  • Skip checkpointing at step=0
  • add abs for max numerical diff log in forward_pass_checker
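
To illustrate the second item: without `abs`, taking the maximum of signed differences can hide a large negative deviation behind a smaller positive (or zero) one. The following is a minimal sketch in plain Python, not the code from forward_pass_checker; the function names and sample values are hypothetical.

```python
def max_signed_diff(expected, actual):
    """Largest signed element-wise difference (can hide negative deviations)."""
    return max(e - a for e, a in zip(expected, actual))


def max_abs_diff(expected, actual):
    """Largest element-wise difference by magnitude."""
    return max(abs(e - a) for e, a in zip(expected, actual))


# Hypothetical values standing in for two sets of model outputs.
expected = [1.0, 2.0, 3.0]
actual = [1.5, 2.0, 5.0]

print("max signed diff:", max_signed_diff(expected, actual))
print("max abs diff:", max_abs_diff(expected, actual))
```

Here the signed maximum reports no difference at all, while the absolute maximum surfaces the deviation in the last element.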

Tests

Integration tests

Checklist

Before submitting this PR, please make sure (put X in square brackets):

  • [x] I have performed a self-review of my code.
  • [x] I have necessary comments in my code, particularly in hard-to-understand areas.
  • [x] I have run end-to-end tests and provided workload links above if applicable.
  • [x] I have made or will make corresponding changes to the doc if needed.

khatwanimohit avatar Mar 18 '25 00:03 khatwanimohit

@xuefgu originally added the checkpoint at step 0, do you have strong opinions about removing it?

gobbleturk avatar Mar 18 '25 01:03 gobbleturk

@xuefgu originally added the checkpoint at step 0, do you have strong opinions about removing it?

No objections. My ancient change, if memory serves, was only to avoid calling save_checkpoint when step was not a multiple of the interval.

Two nits on the PR though:

  1. We could place the step != 0 check first in the condition.
  2. Line 136 could benefit from the same check.
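
A minimal sketch of the first nit, assuming a hypothetical helper `should_save_checkpoint` and a hypothetical `checkpoint_period` parameter (neither name is taken from the PR): the `step != 0` test comes first, so step 0 is rejected before the interval arithmetic runs.

```python
def should_save_checkpoint(step: int, checkpoint_period: int) -> bool:
    """Decide whether to save at `step` (hypothetical helper, not MaxText code)."""
    # Check step != 0 first: 0 % n == 0 for any n, so without this guard
    # step 0 would always count as a multiple of the interval.
    if step == 0:
        return False
    # Assumption for this sketch: a non-positive period disables checkpointing.
    if checkpoint_period <= 0:
        return False
    return step % checkpoint_period == 0


# Steps 0..6 with a period of 3: only 3 and 6 trigger a save.
saved_steps = [s for s in range(7) if should_save_checkpoint(s, 3)]
print(saved_steps)
```

The same helper could be reused wherever the second nit applies, so both call sites share one condition.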

xuefgu avatar Mar 18 '25 04:03 xuefgu

This PR has been automatically marked as stale because it has not had recent activity. It will be closed soon if no further activity occurs. Thank you for your contributions.

github-actions[bot] avatar Dec 02 '25 16:12 github-actions[bot]