
Adds a toy llama2 model

Closed wang2yn84 opened this pull request 9 months ago • 2 comments

Description

The smallest Llama model we support is 7B, which is still too slow for local development. This PR adds a toy model with a much smaller configuration, intended for functional testing only, not accuracy validation.

Tests

export XLA_FLAGS="--xla_gpu_enable_latency_hiding_scheduler=true --xla_gpu_enable_command_buffer=FUSION --xla_disable_hlo_passes=rematerialization --xla_disable_hlo_passes=gpu-convert-async-collectives-to-sync"  # flags from NVIDIA
export TF_FORCE_GPU_ALLOW_GROWTH=true
export BASE_OUTPUT_DIRECTORY=/scratch/temp
export ASYNC_CHECKPOINTING=false
export XLA_PYTHON_CLIENT_MEM_FRACTION=0.92
export PER_DEVICE_BATCH_SIZE=2
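Optionally, before launching the decode run, a quick sanity check that JAX can see the GPUs (not part of the original test steps, just a common preflight check):

# Assumed optional check, not from the PR: confirm JAX enumerates the GPU devices
python3 -c "import jax; print(jax.devices())"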

python3 MaxText/decode.py MaxText/configs/base.yml base_output_directory=${BASE_OUTPUT_DIRECTORY} model_name='llama2-toy' max_prefill_predict_length=1024 max_target_length=2048 attention=dot_product scan_layers=false hardware=gpu async_checkpointing=${ASYNC_CHECKPOINTING} per_device_batch_size=${PER_DEVICE_BATCH_SIZE} run_name=$(date +%Y-%m-%d-%H-%M) ici_fsdp_parallelism=1 ici_autoregressive_parallelism=1 ici_tensor_parallelism=-1 skip_jax_distributed_system=True weight_dtype=float16 dtype=float16 kv_quant_dtype=fp8 quantize_kvcache=True

The run finishes in less than 1 minute.
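As an illustrative extension beyond the PR's own tests, the same toy config could presumably drive a short training smoke test. This is a sketch only: it assumes train.py accepts the same overrides as decode.py and that dataset_type=synthetic is available in this MaxText version.

# Hypothetical smoke test, assuming llama2-toy works with train.py and synthetic data is supported
python3 MaxText/train.py MaxText/configs/base.yml base_output_directory=${BASE_OUTPUT_DIRECTORY} model_name='llama2-toy' hardware=gpu steps=10 per_device_batch_size=${PER_DEVICE_BATCH_SIZE} dataset_type=synthetic run_name=$(date +%Y-%m-%d-%H-%M) skip_jax_distributed_system=True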

Checklist

Before submitting this PR, please make sure (put X in square brackets):

  • [x] I have performed a self-review of my code.
  • [x] I have added necessary comments to my code, particularly in hard-to-understand areas.
  • [x] I have run end-to-end tests and provided workload links above if applicable.
  • [x] I have made or will make corresponding changes to the doc if needed.

wang2yn84 · Apr 03 '25 21:04

Curious how this script is useful to you? You can already run this model by setting the dimensions manually (e.g. base_emb_dim=xx). Is this just to avoid setting all of these parameters all of the time?

gobbleturk · Apr 03 '25 22:04
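For reference, the manual-override approach gobbleturk describes would look roughly like the sketch below. The dimension values are illustrative placeholders, not the ones this PR actually uses, and it assumes the base_* and decoder_block overrides behave as in the current base.yml.

# Illustrative only: toy-sized dimensions passed as CLI overrides instead of a named config
python3 MaxText/decode.py MaxText/configs/base.yml decoder_block=llama2 base_emb_dim=256 base_mlp_dim=512 base_num_decoder_layers=2 base_num_query_heads=4 base_num_kv_heads=4 head_dim=64 hardware=gpu run_name=$(date +%Y-%m-%d-%H-%M)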

This PR has been automatically marked as stale because it has not had recent activity. It will be closed soon if no further activity occurs. Thank you for your contributions.

github-actions[bot] · Nov 18 '25 16:11

This PR was closed because it has been inactive for a while. Please reopen it if you are still working on it.

github-actions[bot] · Nov 26 '25 16:11