kostum123
kostum123
Daily limit
Does the daily limit apply when using this?
GPT 3.5 has 4096 token context window. Do you plan to increase the model's context window and output token limit? I am not a expert in this field but this...
It would be extremely helpful if you could share the list of languages that are included in the dataset used to train the model. If the dataset is available on...
### Reminder - [X] I have read the README and searched the existing issues. ### Reproduction Multiturn datasets are supported in the original dpo example. Wouldn't it be appropriate for...
Wouldn't using Google CSE be a more cost effective solution than using the bing search api? Btw Langchain supports Google CSE. Link: https://programmablesearchengine.google.com/controlpanel/all
I think CHATGPT writes as if it is September 2021 because we do not specify the current time in the promt. This problem can be solved with a python script...
### Reminder - [X] I have read the README and searched the existing issues. ### System Info - ### Reproduction - ### Expected behavior Currently, the contamination-free packaging method is...
### Reminder - [X] I have read the README and searched the existing issues. ### System Info Colab a100 40gb latest commit a2bd6944cd85fdca83407c1cb354f61e57e2ac78 ### Reproduction I encountered a RuntimeError while...
Fixes #305 Fix dtype mismatch in fused_linear_cross_entropy_forward function. * Cast `logits_chunk` to the data type of `_input_chunk` before performing operations on it. --- I tested this in Colab after the...
### 🐛 Describe the bug I encountered a RuntimeError while running a full fine-tuning experiment using the LLaMA-Factory on a model with BFloat16 precision. The error occurred during the training...