Rafi Witten issues

Results: 10 issues of Rafi Witten

python MaxText/decode.py MaxText/configs/base.yml per_device_batch_size=64 run_name=runner_2024-01-30-20-02 max_prefill_predict_length=128 max_target_length=256 dataset_path=gs://maxtext-dataset async_checkpointing=false scan_layers=false attention=dot_product scan_layers=false ici_autoregressive_parallelism=4
400GB/s/device on a v4-8

[NOT FOR MERGE] Test no metrics

[Don't Merge] Multihost Decode Monkey Patch

Debug Strange GPU Bug

The code works fine on TPU but crashes on GPU with a strange error. (See the test logs below from running train.py.)

[NOT FOR MERGE] Rwitten host offload demo

```
[rwitten@t1v-n-621261c1-w-0 2024-03-19 23:26:33] ~/maxtext (rwitten_shmap_collective_matmul_finalized) python3 pedagogical_examples/host_offload.py
F0319 23:26:38.216995 1800494 llo_decomposer.cc:893] Unexpected opcode: dma-vmem-to-host-ram
*** Check failure stack trace: ***
    @ 0x7f2ff084ec24 (unknown)
    @ 0x7f2ff084e744 (unknown)
    @ 0x7f2ff084ef89 (unknown)
...
```


Print Time More Accurately In MaxText

Make XPK Handle multiple slice sizes

N queues, 1 per slice size, 1 cluster. (This is complicated!)

List of Nits From An Early User

* XPK didn't work on Debian because `awk -e` wasn't supported.
* 40 char limit for workloads is constraining, any ideas to fix.
* Deleting workloads periodically automatically?
* Pausing...
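The `awk -e` nit above can be illustrated with a small sketch. It assumes the failure came from a non-GNU awk (Debian's default `awk` is typically mawk), since `-e` is a GNU awk extension. The helper name `first_col` and the sample input are hypothetical and are not taken from XPK.

```shell
# Hypothetical helper, not XPK code: print the first whitespace-separated field.
# gawk-only form (fails where awk is mawk, which lacks -e):
#   awk -e '{ print $1 }'
# Portable form: pass the program text as the first positional argument.
first_col() {
  awk '{ print $1 }'
}

# Usage: feed two sample lines and print the first field of each.
printf 'a b\nc d\n' | first_col
```

Passing the program as a positional argument is the POSIX invocation, so it should behave the same under gawk, mawk, and BusyBox awk.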

Marrying User Metadata Into Cluster

https://kubernetes.io/docs/concepts/configuration/configmap/
(1) Should `workload create` need the accelerator type? Can we extract it from the XPK cluster?
(2) Where else can we simplify? For example, I feel like we should...

Rafi Witten

Full JetEngine Support

Example inference workload