Yanan
https://paddlepedia.readthedocs.io/en/latest/tutorials/deep_learning/distances/distances.html Jaccard Coefficient error
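For reference when checking that section, the standard definition of the Jaccard coefficient is the size of the intersection divided by the size of the union, and the Jaccard distance is one minus that value. The sketch below is not taken from the tutorial page. The function names are illustrative, and the treatment of two empty sets is a convention assumed here.

```python
def jaccard_coefficient(a, b):
    """Return |A ∩ B| / |A ∪ B| for two iterables treated as sets."""
    set_a, set_b = set(a), set(b)
    union = set_a | set_b
    if not union:
        # Assumed convention for this sketch: two empty sets count as identical.
        return 1.0
    return len(set_a & set_b) / len(union)


def jaccard_distance(a, b):
    """Return 1 - Jaccard coefficient, so identical sets have distance 0."""
    return 1.0 - jaccard_coefficient(a, b)
```

For example, `{1, 2, 3}` and `{2, 3, 4}` share two of four distinct elements, so both the coefficient and the distance are 0.5.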
> ```
> bow_indices.append(
>     [tokenizer.encode(word.strip(),
>                       add_prefix_space=True,
>                       add_special_tokens=False)
>      for word in words])
> ```
>
> I tried to run this code, and all words are composed...
I intentionally fabricated some samples with incorrect labels, `df_noise`. They can be regarded as a honeypot to test TracIn's ability to score them. It is anticipated that TracIn should give...
For checkpoint selection, I currently pick the checkpoints that clearly reduce the loss on the validation set. I set the total number of epochs to 50, and checkpoints from early epochs are...
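To make the role of the checkpoints concrete, here is a minimal sketch of the checkpoint-based TracIn score (TracInCP) as I understand it from the TracIn paper: the influence of a training example on a test example is the sum, over the selected checkpoints, of the learning rate times the dot product of their loss gradients. It assumes the per-checkpoint gradients are already computed and flattened into plain lists. The function names are hypothetical and this is not the code used in the experiment above.

```python
def dot(u, v):
    """Dot product of two equal-length gradient vectors."""
    return sum(x * y for x, y in zip(u, v))


def tracin_score(train_grads, test_grads, learning_rates):
    """Sum over checkpoints of lr * <grad(train example), grad(test example)>.

    Each argument holds one entry per selected checkpoint.
    """
    if not (len(train_grads) == len(test_grads) == len(learning_rates)):
        raise ValueError("expected one gradient pair and one learning rate per checkpoint")
    return sum(
        lr * dot(g_train, g_test)
        for lr, g_train, g_test in zip(learning_rates, train_grads, test_grads)
    )


def self_influence(train_grads, learning_rates):
    """Influence of a training example on itself, often used to flag suspect labels."""
    return tracin_score(train_grads, train_grads, learning_rates)
```

Under this formulation a checkpoint where the gradients are near zero contributes little to the score, which is one reason the choice of checkpoints matters.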
> You could set the `init_kl_coeff=0` (see [here](https://github.com/lvwerra/trl/blob/750f5fd5329bb81c79b00243c4c8923ac14981d5/trl/ppo.py#L93)) to liberate the model from the reference completely or increase the KL target `target` (which is 6 by default).

Thanks.
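One way to see why both suggestions loosen the tie to the reference model is a small sketch of an adaptive KL penalty controller in the style of Ziegler et al. This is an illustration only, not trl's code. The parameter name `init_kl_coeff` follows the spelling in the quoted comment, and the clipping bound of 0.2 and the horizon of 10000 are assumptions made for the example.

```python
class AdaptiveKLController:
    """Scales the KL penalty coefficient toward a target KL value."""

    def __init__(self, init_kl_coeff, target, horizon):
        self.value = init_kl_coeff  # current KL penalty coefficient
        self.target = target        # desired KL between policy and reference
        self.horizon = horizon      # larger horizon means slower adaptation

    def update(self, current_kl, n_steps):
        # Relative error against the target, clipped to keep updates gentle.
        error = max(-0.2, min(0.2, current_kl / self.target - 1.0))
        # Multiplicative update: a coefficient of 0 therefore stays at 0.
        self.value *= 1.0 + error * n_steps / self.horizon
        return self.value


# Same observed KL of 12, three different settings.
free = AdaptiveKLController(init_kl_coeff=0.0, target=6.0, horizon=10000)
default_like = AdaptiveKLController(init_kl_coeff=0.2, target=6.0, horizon=10000)
loose = AdaptiveKLController(init_kl_coeff=0.2, target=24.0, horizon=10000)
for controller in (free, default_like, loose):
    controller.update(current_kl=12.0, n_steps=256)
```

In this sketch the controller started at zero never applies a penalty, the one with the lower target raises its coefficient, and the one with the higher target lowers it.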
By the way, have you investigated how to tune `txt_in_len` and `txt_out_len` to better serve topic/sentiment preservation in the generated texts? Currently, I find that fine-tuning the...
@lvwerra Hi, I recently found that you added a simple code demo here https://lvwerra.github.io/trl// where `ppo_config = {'batch_size': 1, 'forward_batch_size': 1}`. I suppose this is single-sample mode, rather...
@lvwerra Thanks. I find that it is crucial to design a good reward feedback module that can return a positive or negative reward. And the reference GPT...
vllm version: 0.4.1
I printed `batch["input_ids"]` and found that the tensors are on `cuda:0`: ``` tensor([ 1, 995, 460, ..., 3177, 9116, 28747], device='cuda:0') torch.Size([7980]) tensor([ 1, 995, 460, ..., 3177, 9116, 28747],...