lilisierrayu

5 comments by lilisierrayu

@klshuster One clarification question on Step 0: with consolidate_fsdp_shards.py, the default produces a model with mp=1. Do I need to pass the argument '--new-arch-name transformer_lm_gpt'? Or is it fine...

> Are all the file renames working with fb_sweep in metaseq_internal? The only impacted fb_sweep script is launch_api_helper.py. Training should be fine. Metaseq main already has the rename, so if...

I would also like to add "--fp16", "--memory_efficient_fp16" (and later on "--bf16") in constant.py to keep model initialization the same as in training, but that is not needed for the current fix.

> Given the current complicated situation outside research community, we refrain from disclosing more details about data. Nevertheless, researchers may take a look at that dataset project everyone know. @lllyasviel...

@lllyasviel Could you please share the details?