maxtext
Explicitly pass in weight dtype and activation dtype for the offline serving script.
…ine run.
Description
For the offline serving engine, the weight dtype is hardcoded to bf16, which blocks experiments with other dtypes. This PR passes the weight dtype and activation dtype in explicitly to unblock those experiments.
The rest of the description includes relevant details and context, examples:
- why is this change being made,
- the problem being solved and any relevant context,
- why this is a good solution,
- some information about the specific implementation,
- shortcomings of the solution and possible future improvements.
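As a rough illustration of the idea in the description (replacing a hardcoded bf16 with explicitly supplied dtypes), here is a minimal sketch. It is not the code from this PR: `DtypeOptions`, `build_engine_args`, `SUPPORTED_DTYPES`, and the `key=value` override format are all hypothetical names and conventions chosen for the example.

```python
from dataclasses import dataclass

# Assumption: an illustrative set of dtype names, not the list the engine accepts.
SUPPORTED_DTYPES = ("bfloat16", "float16", "float32")


@dataclass(frozen=True)
class DtypeOptions:
    """Hypothetical container for the two dtypes the caller now supplies."""

    weight_dtype: str = "bfloat16"
    activation_dtype: str = "bfloat16"

    def __post_init__(self):
        # Fail early on a dtype name outside the illustrative supported set.
        for name in ("weight_dtype", "activation_dtype"):
            value = getattr(self, name)
            if value not in SUPPORTED_DTYPES:
                raise ValueError(f"unsupported {name}: {value!r}")


def build_engine_args(base_args, opts):
    """Return a new argument list with the dtype overrides appended.

    The key=value form is an assumption made for this sketch.
    """
    return list(base_args) + [
        f"weight_dtype={opts.weight_dtype}",
        f"activation_dtype={opts.activation_dtype}",
    ]
```

With this shape, a caller that wants a different weight dtype constructs `DtypeOptions(weight_dtype="float32")` and passes it through, rather than editing a constant inside the engine setup.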
Tests
Manual run test.
Checklist
Before submitting this PR, please make sure (put X in square brackets):
- [x] I have performed a self-review of my code.
- [x] I have necessary comments in my code, particularly in hard-to-understand areas.
- [x] I have run end-to-end tests and provided workload links above if applicable.
- [x] I have made or will make corresponding changes to the doc if needed.