Yanan comments

Results: 42 comments of Yanan

Documentation error

https://paddlepedia.readthedocs.io/en/latest/tutorials/deep_learning/distances/distances.html The Jaccard Coefficient is wrong
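For reference, the standard Jaccard coefficient of two sets is the size of their intersection divided by the size of their union. A minimal sketch follows; the function name `jaccard_coefficient` and the convention for two empty sets are illustrative assumptions, not taken from the linked page.

```python
def jaccard_coefficient(a, b):
    """Return |A ∩ B| / |A ∪ B| for two iterables treated as sets."""
    set_a, set_b = set(a), set(b)
    union = set_a | set_b
    if not union:
        # Assumed convention: two empty sets are treated as identical.
        return 1.0
    return len(set_a & set_b) / len(union)
```

The Jaccard distance is then `1 - jaccard_coefficient(a, b)`.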

Filtering words composed of more than 1 token

> ```
> bow_indices.append(
>     [tokenizer.encode(word.strip(),
>                       add_prefix_space=True,
>                       add_special_tokens=False)
>      for word in words])
> ```
>
> I tried to run this code, and all words are composed...
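One way to keep only the words that map to a single token is to filter the encoded lists by length. The sketch below is an illustration under assumptions: `ToyTokenizer` is a hypothetical stand-in so the example runs without downloading a model, and `single_token_words` is a name introduced here. With a real tokenizer, only the `encode` call would change.

```python
class ToyTokenizer:
    """Hypothetical stand-in: splits text into chunks of at most 4 characters."""

    def encode(self, text):
        return [text[i:i + 4] for i in range(0, len(text), 4)]


def single_token_words(words, tokenizer):
    """Encode each word and keep only encodings that are exactly one token long."""
    encoded = [tokenizer.encode(word.strip()) for word in words]
    return [tokens for tokens in encoded if len(tokens) == 1]


words = ["cat ", "dog", "elephant"]
print(single_token_words(words, ToyTokenizer()))
```

If the filtered list comes back empty, every word was split into several tokens, which points at the encoding arguments rather than at the filter.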

Test TracIn's effectiveness in text classification

I intentionally fabricated some samples with incorrect labels, `df_noise`. They can be regarded as a honeypot to test TracIn's ability to score them. It is anticipated that TracIn should give...

Test TracIn's effectiveness in text classification

For checkpoint selection, I currently pick the checkpoints that noticeably reduce the loss on the validation set. I set the total number of epochs to 50, and checkpoints from early epochs are...
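The selection rule described above can be sketched as follows. The function name `pick_checkpoints`, the threshold `min_drop`, and the sample losses are illustrative assumptions, not values from the experiment.

```python
def pick_checkpoints(val_losses, min_drop=0.05):
    """Return the epoch indices whose validation loss fell by at least
    `min_drop` compared with the previous epoch."""
    picked = []
    for epoch in range(1, len(val_losses)):
        drop = val_losses[epoch - 1] - val_losses[epoch]
        if drop >= min_drop:
            picked.append(epoch)
    return picked


# Hypothetical per-epoch validation losses.
print(pick_checkpoints([1.0, 0.7, 0.68, 0.5, 0.55]))
```

A relative threshold (drop divided by the previous loss) may be more robust when the loss scale changes a lot between early and late epochs.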

How to liberate the gpt2 from reference model?

> You could set the `init_kl_coeff=0` (see [here](https://github.com/lvwerra/trl/blob/750f5fd5329bb81c79b00243c4c8923ac14981d5/trl/ppo.py#L93)) to liberate the model from the reference completely or increase the KL target `target` (which is 6 by default). Thanks.
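To see why a zero initial coefficient removes the reference model's influence, consider an adaptive KL controller with a multiplicative update. The class below is a self-contained sketch written for illustration; `AdaptiveKLSketch`, `horizon`, the clipping range, and `penalized_reward` are assumptions and not the library's code. Because the coefficient is only ever multiplied, a value of 0 stays at 0, so the KL penalty never enters the reward.

```python
class AdaptiveKLSketch:
    """Illustrative adaptive KL coefficient with a multiplicative update."""

    def __init__(self, init_kl_coeff, target, horizon):
        self.value = init_kl_coeff
        self.target = target
        self.horizon = horizon

    def update(self, current_kl, n_steps):
        # Proportional error, clipped so that each update stays small.
        error = max(-0.2, min(0.2, current_kl / self.target - 1))
        self.value *= 1 + error * n_steps / self.horizon


def penalized_reward(score, kl, kl_coeff):
    """Reward after subtracting the KL penalty against the reference model."""
    return score - kl_coeff * kl


controller = AdaptiveKLSketch(init_kl_coeff=0.0, target=6, horizon=10000)
controller.update(current_kl=12.0, n_steps=256)
print(controller.value, penalized_reward(1.5, 12.0, controller.value))
```

In this sketch, raising `target` above the observed KL makes the error negative, so a nonzero coefficient shrinks over time instead of growing.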

How to liberate the gpt2 from reference model?

By the way, have you investigated how to tune `txt_in_len` and `txt_out_len` to better serve the topic/sentiment preservation of the generated texts? Currently, I find that fine-tuning the...

How to liberate the gpt2 from reference model?

@lvwerra Hi, I recently found that you added a simple code demo here https://lvwerra.github.io/trl// where `ppo_config = {'batch_size': 1, 'forward_batch_size': 1}`. I suppose this is single-sample mode, rather...

How to liberate the gpt2 from reference model?

@lvwerra Thanks. I find that it is crucial to design a good reward feedback module that can return a reward with a positive or negative value. And the reference GPT...

[Bug]: when dtype='bfloat16', batch_size will cause different inference results

vllm version: 0.4.1

[BUG]output tensor must have the same type as input tensor in PPO training script of TRL

I printed `batch["input_ids"]` and found that they are on cuda:0:
```
tensor([ 1, 995, 460, ..., 3177, 9116, 28747], device='cuda:0')
torch.Size([7980])
tensor([ 1, 995, 460, ..., 3177, 9116, 28747],...
```