Zichun Yu comments


Question regarding Shuffling

Hi @LeoXinhaoLee, I am also curious about this. Are there any conclusions?

`litdata.optimize` accidentally deletes files from the local filesystem

Hi @tchaton, is there a way to set `DATA_OPTIMIZER_CACHE_FOLDER` in the Python script rather than as an environment variable? I didn't find such an interface. Thank you!
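One possible workaround, not confirmed in this thread, is to set the variable on `os.environ` from inside the script before the library is imported. The sketch below assumes `litdata` reads `DATA_OPTIMIZER_CACHE_FOLDER` from the process environment at import time or when `optimize` is called; the cache path is a hypothetical placeholder.

```python
import os
import tempfile

# Hypothetical cache location; any writable directory would do.
cache_dir = os.path.join(tempfile.gettempdir(), "litdata_optimizer_cache")
os.makedirs(cache_dir, exist_ok=True)

# Set the variable before importing the library, in case the value
# is read once at import time (an assumption, not verified here).
os.environ["DATA_OPTIMIZER_CACHE_FOLDER"] = cache_dir

# import litdata  # import only after the variable has been set
```

A variable set this way is visible to the current process and to any child processes it starts afterwards, but not to the parent shell.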

Code Generation Evals should parse code from LM response

Hi @manishshettym, thanks for your insightful comments. We will try to parse the code more robustly and examine better prompts. As we evaluate the code generation in a zero-shot fashion, we...

Code Generation Evals should parse code from LM response

Perfect! Thanks, @Naman-ntc. I will try it.

Code Generation Evals should parse code from LM response

Hi @DeepCreeper, yes, we need to add both system and user prompts when we call the API for optimal performance. You can find more information in this [repo](https://github.com/theoxo/self-repair) (and also...

Cannot reproduce the results shown in Github repo with the 120M reference model on A800 (8*80G).

Same issue. I used exactly the weights they provided in `pile_doremi_r1_120M_ref:pile_baseline_50kvocab_nopack_120M.json`. I found that the SQuAD accuracy of the main model drops a lot as training goes on. BTW, I only run...

Cannot reproduce the results shown in Github repo with the 120M reference model on A800 (8*80G).

@kiseliu Thanks for the information. I would like to discuss the experimental configurations in more detail. Could you check the email I sent you (Gmail)?

Cannot reproduce the results shown in Github repo with the 120M reference model on A800 (8*80G).

Sure. This is my wandb report: https://api.wandb.ai/links/zhiyuan-chenyan-zhenghao-group/cfo97a5p SQuAD is the most unstable one.

Cannot reproduce the results shown in Github repo with the 120M reference model on A800 (8*80G).

Also, it seems that the avg_acc of the baseline model easily exceeds 6 in both my experiments and @kiseliu's, while in @sangmichaelxie's report, the highest checkpoint is...

Cannot reproduce the results shown in Github repo with the 120M reference model on A800 (8*80G).

Thank you so much! I will try it out.