Merlin comments

Results: 24 comments of Merlin

Results about common datasets?

> Hello, glad to see your excellent work. I wonder whether you can supply the results on common datasets like COCO or VOC? Thanks for replying. With this code, the experiment at...

Cannot get similar AP on K2C

> I use the same config files to run Probabilistic Teacher on KITTI to Cityscapes (k2c). According to your paper, the AP of k2c improves from 40.3 to 60.2. However, I...

the pre-training loss

> The pre-training losses are always negative (like -199.03), is that normal? I also got a negative loss. Have you solved it?

Can I add other models to fauxpilot besides the codegen series, like gpt-neo?

Hello, in my understanding, fauxpilot uses the gptj interface in the triton ft backend because codegen shares the same arch as gptj. If I want to add a model that has slightly...

Huge difference in model metrics between streaming and non-streaming modes

So non-streaming mode shuffles the entire training dataset, while streaming mode only shuffles within the buffer size when sampling data?
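The distinction raised in this question can be sketched in plain Python. This is only an illustration under stated assumptions, not the code of any particular library: `full_shuffle` and `buffer_shuffle` are hypothetical names, and the buffer strategy shown (fill a fixed-size buffer, then emit a random slot and refill it) is one common way a streaming shuffle can be implemented.

```python
import random
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def full_shuffle(items: List[T], seed: int = 0) -> List[T]:
    """Non-streaming case: the whole dataset is in memory, so any item
    can end up at any position."""
    rng = random.Random(seed)
    shuffled = list(items)
    rng.shuffle(shuffled)
    return shuffled


def buffer_shuffle(stream: Iterable[T], buffer_size: int, seed: int = 0) -> Iterator[T]:
    """Streaming case (hypothetical sketch): only `buffer_size` items are
    held at once, so an item can move earlier in the output by at most
    `buffer_size - 1` positions."""
    if buffer_size < 1:
        raise ValueError("buffer_size must be at least 1")
    rng = random.Random(seed)
    buffer: List[T] = []
    for item in stream:
        if len(buffer) < buffer_size:
            # Still filling the buffer; nothing is emitted yet.
            buffer.append(item)
            continue
        # Emit a random buffered item and put the new item in its slot.
        idx = rng.randrange(buffer_size)
        yield buffer[idx]
        buffer[idx] = item
    # Stream exhausted: emit whatever is left in random order.
    rng.shuffle(buffer)
    yield from buffer


data = list(range(1000))
globally_mixed = full_shuffle(data)
locally_mixed = list(buffer_shuffle(iter(data), buffer_size=10))
```

With a small buffer and data stored in a sorted or grouped order, the streamed output stays close to the original order, which is one possible reason the two modes could produce different training metrics.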

[BUG] Always get errno: 110 - Connection timed out when using deepspeed multi-node training.

I ran into the same issue. Have you found a solution? @Luoyang144

Getting this error during weight conversion of llama-13b: RuntimeError: Internal: unk is not defined.

> I am using Ubuntu Linux. > > with following virtual environment > > **accelerate==0.18.0 certifi==2022.12.7 charset-normalizer==3.1.0 cmake==3.26.3 filelock==3.12.0 huggingface-hub==0.13.4 idna==3.4 Jinja2==3.1.2 lit==16.0.1 MarkupSafe==2.1.2 mpmath==1.3.0 networkx==3.1 numpy==1.24.2 nvidia-cublas-cu11==11.10.3.66 nvidia-cuda-cupti-cu11==11.7.101 nvidia-cuda-nvrtc-cu11==11.7.99...

[Feature] Support datasets in OpenAI/evals

> ### Describe the feature > [OpenAI/evals](https://github.com/openai/evals/tree/main/evals/registry/data) contains many community-sourced tasks, with about 400-500 datasets covering various languages and fields. Compared to other open-source datasets, they have very few questions,...

No private IP address found on Azure App Service

Same issue. Has anybody solved this?