leileilin

13 comments by leileilin

> > Training error: "Providing a bool or integral fill value without setting the optional `dtype` or `out` arguments is currently unsupported. In PyTorch 1.7,"
> > Which version of PyTorch can be used?
>
> You could try a lower version of PyTorch, such as 1.4. If convenient, please tell us the exact location of the code that raises the error, and we will fix this bug as soon as possible. Thanks!...
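This error typically appears when `torch.full` receives a bool or integer fill value without an explicit `dtype` (PyTorch 1.5–1.7 rejected or deprecated that pattern). A minimal sketch of the fix, assuming the failing call is a `torch.full` of this shape (the actual call site in the repo may differ):

```python
import torch

# Assumed reproduction: on PyTorch 1.5-1.7 the line below raises
# "Providing a bool or integral fill value without setting the optional
#  `dtype` or `out` arguments is currently unsupported".
# mask = torch.full((2, 3), True)

# Portable fix: state the dtype explicitly.
mask = torch.full((2, 3), True, dtype=torch.bool)
```

The same applies to integer fills, e.g. `torch.full(size, 0, dtype=torch.long)`, which keeps the code working on both old and new PyTorch versions.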

> Hello Ye, thanks for making the code public.
>
> Not sure if I understand correctly, but from [here](https://github.com/yeliu918/KG-BART/blob/master/KGBART/KGBART_model/modeling_kgbart.py#L1076) it seems that the KG-augmented decoding layers are not applied...

> > May I ask whether your transformer-base uses the default sequence length of 1024? In the PDC test set, 35 of the 148 Chinese documents exceed 1024 (counted in subwords). If 1024 is used, how should those 35 documents be handled? Thanks.
>
> Exceeding 1024 is not a problem and does not affect the model. Could it be that your positional embedding is a fixed matrix? We use trigonometric functions as the positional embedding, so there is no length limit.

In that case, will inputs longer than 1024 still be fed to the model? Thanks!!!
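The trigonometric positional embedding mentioned in the reply can be sketched in plain Python. The function name and the base 10000 follow the standard Transformer formulation and are illustrative, not the authors' exact code:

```python
import math

def sinusoidal_embedding(pos, dim):
    """Sinusoidal positional embedding for a single position.

    Because each entry is computed from trigonometric functions rather than
    looked up in a fixed matrix, it is defined for any position index, so
    inputs longer than 1024 impose no hard limit.
    """
    emb = []
    for i in range(0, dim, 2):
        angle = pos / (10000 ** (i / dim))
        emb.append(math.sin(angle))  # even dimensions use sine
        emb.append(math.cos(angle))  # odd dimensions use cosine
    return emb[:dim]
```

A fixed learned embedding matrix, by contrast, only covers positions up to its row count, which is why models with learned positional embeddings do hit a hard length cap.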

> Hello! There are some details about the litbank experiment in https://aclanthology.org/2021.emnlp-main.425/ both in the figures and Table 6, but if you want more detailed numbers, I'm happy to provide...

> Hello, leileilin. My understanding is that the ELECTRA model pretrains the generator and discriminator together, so the loss is the sum of the generator loss and the discriminator loss, each with its own weight:
>
> ```
> #...
> ```
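The weighted sum described above can be sketched as follows; the function name is illustrative, and the default weight of 50 is the λ value used in the ELECTRA paper:

```python
def electra_loss(gen_loss, disc_loss, disc_weight=50.0):
    # Joint ELECTRA objective: generator MLM loss plus the discriminator's
    # replaced-token-detection loss. The discriminator term is up-weighted
    # because its per-token loss is much smaller (the paper uses lambda = 50).
    return gen_loss + disc_weight * disc_loss
```

For example, with a generator loss of 2.0 and a discriminator loss of 0.1, the combined loss is 2.0 + 50 × 0.1 = 7.0.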

> Hi, do you use a Chinese dataset to train your model? Could you try it and tell us the result? Thanks.

I have a question to ask you. The data...

Also, is torch 1.4.0 actually guaranteed to support apex? Why do I get an error saying it is not supported?

It also occurs when I use `sequence_parallel_size`.

> use this way
>
> ```
> python -m vllm.entrypoints.openai.api_server \
>     --model "path to model" \
>     --port 8459 \
>     --gpu-memory-utilization 0.95 \
>     --trust-remote-code \
>     ...
> ```
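A server started this way exposes an OpenAI-compatible HTTP API. A minimal client sketch using only the standard library, where the port matches the command above and the model name and prompt are placeholders:

```python
import json
import urllib.request

payload = {
    "model": "path to model",  # must match the --model value the server started with
    "prompt": "Hello",
    "max_tokens": 16,
}
req = urllib.request.Request(
    "http://localhost:8459/v1/completions",  # 8459 matches --port above
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)
# Sending the request requires the server to actually be running:
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp)["choices"][0]["text"])
```

Any OpenAI-compatible client (e.g. the `openai` Python package pointed at `base_url="http://localhost:8459/v1"`) works the same way.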

> > Have you tried passing `--hf-overrides '{"rope_scaling": {"factor": 4.0, "original_max_position_embeddings": 32768, "type": "yarn"}}'`?
>
> It works for me. I also tried adding the params directly to the...
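As a sanity check on the override above: YaRN rope scaling extends the context window multiplicatively, so the numbers in that JSON imply the following maximum length (a worked calculation, not output from vLLM):

```python
# Values taken from the --hf-overrides JSON above.
original_max = 32768   # original_max_position_embeddings
factor = 4.0           # rope_scaling factor

# With yarn-style scaling, the usable context grows by the scaling factor.
extended_max = int(original_max * factor)
print(extended_max)  # 131072
```

So the override targets a 128K (131072-token) context window.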