geolvr issues

Results: 11 issues of geolvr

Does minimizing this loss function minimize or maximize mutual information?

I'm confused about this. The NCE loss gives a lower bound on mutual information. In this implementation, should the NCE loss be minimized in order to increase mutual information?
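
For reference, the bound usually quoted for the InfoNCE objective is sketched below. That the implementation in question uses this standard form is an assumption, and the symbols $x$, $c$, and $N$ (one positive plus $N-1$ negative samples) are introduced here for illustration only.

```latex
I(x; c) \;\ge\; \log N - \mathcal{L}_{\mathrm{NCE}}
```

Under that assumption, a smaller $\mathcal{L}_{\mathrm{NCE}}$ gives a larger lower bound, so minimizing the loss pushes the bound on mutual information upward.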

Could you provide a multi-GPU example for BERT text classification?

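When asking a question, please provide the following information where possible: ### Basic information - **Operating system** you use: Linux - **Python** version you use: 3.7.1 - **Tensorflow** version you use: 1.14.0 - **Keras** version you use: 2.3.1 - **bert4keras** version you use: - Pure **keras** or **tf.keras**: - **Pre-trained model** you loaded: roformer, bert ### Core code I noticed that in task_seq2seq_autotitle.py the Datagenerator yields [batch_token_ids, batch_segment_ids], None, whereas in its multi-GPU version, task_seq2seq_autotitle_multigpu.py, the Datagenerator yields token_ids,...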

How do I load weights fine-tuned with Google's original BERT code?

When asking a question, please provide the following information where possible: ### Basic information - **Operating system** you use: linux - **Python** version you use: 3.7 - **Tensorflow** version you use: 1.14.0 - **Keras** version you use: 2.3.1 - **bert4keras** version you use: - Pure **keras** or **tf.keras**: tf.keras - **Pre-trained model** you loaded: bert ### Core code ```python config_path = '../chinese_L-12_H-768_A-12/bert_config.json' checkpoint_path = '..chinese_L-12_H-768_A-12/bert_model.ckpt'...

Will RoFormer pre-training be supported?

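### What I tried: I found a pre-training example at https://github.com/ZhuiyiTechnology/roformer/blob/main/train.py, but it only implements MLM and works at word granularity. On the other hand, this repo states that pre-training supports only the RoBERTa and GPT approaches. In models.py, the RoFormer implementation is based on NEZHA, which in turn inherits from BERT. I am confused about whether the existing implementation actually supports RoFormer pre-training. Specifically, if I want to do character-level pre-training on my own data (for example, starting from chinese_roformer-char_L-12_H-768_A-12, or from scratch), is that possible?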

Error when the batch size is set to 4 or 8

What's the difference between your custom layer "My_Dot" and the "Dense" layer in Keras?

Is it possible to do unsupervised incremental fine-tuning on a domain-specific dataset?

The original DailyDialog has 13,118 dialogues; how do they correspond to zhdd?

Please specify the PyTorch version number

Can a similar workflow be implemented with LangChain agents?