M3FEND

Dataset issues

Open qiljj opened this issue 3 years ago • 12 comments

Hello, I have a question about the dataset. After opening the .pkl file, I found that the content in the middle appears compressed, and I don't know how to handle it. I also see that this dataset contains text. Could you upload the dataset you have processed? Thank you very much.

qiljj avatar Dec 10 '22 13:12 qiljj

The dataset as provided is already processed. After unzipping it, you can run the program directly.

easezyc avatar Dec 10 '22 13:12 easezyc

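Attachment: 数据集问题.pdf

1. How did you open this .pkl file? When I open it in PyCharm, I get garbled text with both UTF-8 and GBK.
2. Also, how does text get passed into the model? Shouldn't everything that runs through the model be numeric?

Figure 2 shows my attempt to convert the pkl file to a txt file; the content in the middle was omitted.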

qiljj avatar Dec 11 '22 02:12 qiljj

1. Refer to the pickle library; the code also demonstrates how to open the file with pickle.
2. The code has a part that converts tokens to indices.
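As a rough illustration of the two points above, here is a minimal sketch. It is not the repository's own code: the sample records, the `demo.pkl` file, the whitespace tokenizer, and the helper names `load_pickle`, `build_vocab`, and `encode` are all assumptions made for the example. It shows that a pickle file is binary, so it has to be read with `pickle.load` rather than opened as UTF-8 or GBK text, and that text reaches a model only after each token has been mapped to an integer index.

```python
import pickle
import tempfile
from pathlib import Path


def load_pickle(path):
    """Load a pickled Python object; the file is binary, not encoded text."""
    with open(path, "rb") as f:
        return pickle.load(f)


def build_vocab(token_lists):
    """Give each distinct token an integer index; index 0 is kept for padding."""
    vocab = {"<pad>": 0}
    for tokens in token_lists:
        for tok in tokens:
            vocab.setdefault(tok, len(vocab))
    return vocab


def encode(tokens, vocab, max_len):
    """Map tokens to indices, then truncate or pad to exactly max_len."""
    ids = [vocab[t] for t in tokens][:max_len]
    return ids + [0] * (max_len - len(ids))


# Hypothetical stand-in for the real dataset file (assumed structure).
samples = [
    {"content": "fake news spreads fast", "label": 1},
    {"content": "news spreads", "label": 0},
]
demo_path = Path(tempfile.mkdtemp()) / "demo.pkl"
with open(demo_path, "wb") as f:
    pickle.dump(samples, f)

loaded = load_pickle(demo_path)
# Whitespace splitting is a toy tokenizer chosen for this sketch only.
token_lists = [s["content"].split() for s in loaded]
vocab = build_vocab(token_lists)
encoded = [encode(t, vocab, max_len=5) for t in token_lists]
```

Only only the integer lists in `encoded` would be handed to a model. The tokenizer and vocabulary in the actual code may well differ from this toy version.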

easezyc avatar Dec 12 '22 15:12 easezyc

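1. I followed your code for reading the pkl file. Figure 1 is the code I extracted to convert the pkl file to a txt file, but much of the content in the resulting txt file is omitted (Figure 2). I really don't know how to handle this.
2. Suppose I do convert the pkl file to a txt file successfully. Since content and style_feature are in the same document, is further processing needed? I have seen datasets that contain only the labels obtained after processing. I did not find this kind of processing in your code, so do I need to process the data myself?

Thank you very much.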

qiljj avatar Jan 07 '23 04:01 qiljj

Hello, would it be convenient for you to add me on WeChat?

qiljj avatar Jan 07 '23 04:01 qiljj

A simple approach: load the pkl, iterate over every sample, convert it to JSON format, and save the JSON.
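The steps just described could look roughly like the sketch below. It assumes the pickle holds either a list-like collection of per-sample dicts or a pandas DataFrame; the actual structure of the dataset files is not confirmed in this thread, and the helper names `to_jsonable` and `pkl_to_json` and the demo record are hypothetical. One possible reason for the omitted content reported earlier is that a printed representation was written to the txt file, since libraries such as NumPy and pandas abbreviate long output when printing. Serializing each value explicitly avoids that.

```python
import json
import pickle
import tempfile
from pathlib import Path


def to_jsonable(value):
    """Recursively convert common non-JSON types into plain Python types."""
    if isinstance(value, dict):
        return {str(k): to_jsonable(v) for k, v in value.items()}
    if isinstance(value, (list, tuple, set)):
        return [to_jsonable(v) for v in value]
    if hasattr(value, "tolist"):  # e.g. NumPy arrays and scalars
        return to_jsonable(value.tolist())
    if isinstance(value, (str, int, float, bool)) or value is None:
        return value
    return str(value)  # fallback for anything else


def pkl_to_json(pkl_path, json_path):
    """Load a pickle, convert every sample, and write one JSON file."""
    with open(pkl_path, "rb") as f:
        data = pickle.load(f)
    # Assumption: a pandas DataFrame is turned into one dict per row.
    if hasattr(data, "to_dict"):
        data = data.to_dict(orient="records")
    records = [to_jsonable(sample) for sample in data]
    with open(json_path, "w", encoding="utf-8") as f:
        # ensure_ascii=False keeps Chinese text readable in the output.
        json.dump(records, f, ensure_ascii=False, indent=2)
    return len(records)


# Hypothetical demo data standing in for the real dataset file.
demo_dir = Path(tempfile.mkdtemp())
demo_samples = [{"content": "示例文本", "label": 0, "style_feature": (0.1, 0.2)}]
with open(demo_dir / "demo.pkl", "wb") as f:
    pickle.dump(demo_samples, f)

count = pkl_to_json(demo_dir / "demo.pkl", demo_dir / "demo.json")
with open(demo_dir / "demo.json", encoding="utf-8") as f:
    round_trip = json.load(f)
```

JSON has no tuple or array type, so `style_feature` comes back as a plain list. Anything that consumes the JSON would need to convert it back if it expects another type.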

easezyc avatar Jan 07 '23 05:01 easezyc

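1. I have converted the whole dataset to JSON format. Was the earlier pkl format an intermediate result (no need to answer)? Is JSON the format ultimately used to train the model?
2. The file m3fend_oneloss_param.txt under ./logs/param is empty, yet it is used here: parser.add_argument('--param_log_dir', default = './logs/param')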

qiljj avatar Jan 19 '23 02:01 qiljj

Hello, you wrote that you used pytorch>1.0. Could you share the specific PyTorch version and GPU model you used? Thanks.

qiljj avatar Feb 02 '23 01:02 qiljj

I think it was PyTorch 1.6, with a V100 GPU.

easezyc avatar Feb 03 '23 02:02 easezyc

Hi sir, I am a student from NITPY, India. I am doing my micro project on this paper, and some errors come up when I run your code. Could you please help me solve them? The error is shown below.

lr: 0.0001; model name: m3fend; batchsize: 64; epoch: 50; gpu: 1; domain_num: 3
Traceback (most recent call last):
  File "main.py", line 112, in <module>
    Run(config = config).main()
  File "C:\Users\hashi\OneDrive\Desktop\Project\Fake news\grid_search.py", line 134, in main
    trainer = M3FENDTrainer(emb_dim = self.emb_dim, mlp_dims = self.mlp_dims, use_cuda = self.use_cuda,
TypeError: __init__() got an unexpected keyword argument 'dataset'
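A `TypeError` of this form means the call passed a keyword that the constructor's signature does not declare. One way to narrow that down, sketched here with a hypothetical `DemoTrainer` class and made-up argument values rather than the repository's `M3FENDTrainer`, is to compare the keywords being passed against the signature using the standard `inspect` module.

```python
import inspect


class DemoTrainer:
    """Hypothetical stand-in for a trainer class; not the repository's code."""

    def __init__(self, emb_dim, mlp_dims, use_cuda):
        self.emb_dim = emb_dim
        self.mlp_dims = mlp_dims
        self.use_cuda = use_cuda


def unexpected_kwargs(cls, kwargs):
    """Return the keyword names that cls.__init__ would reject."""
    params = inspect.signature(cls.__init__).parameters
    # A **kwargs parameter accepts any keyword, so nothing is rejected.
    if any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values()):
        return []
    return sorted(k for k in kwargs if k not in params)


# Made-up values; only the keyword names matter for this check.
call_kwargs = {"emb_dim": 768, "mlp_dims": [384], "use_cuda": True, "dataset": "en"}
extra = unexpected_kwargs(DemoTrainer, call_kwargs)
```

Here `extra` lists `dataset`, the one keyword the stand-in constructor does not accept. Running the same check against the real trainer class would show whether the caller and the class definition come from mismatched versions of the code.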

Hashirnihal avatar Dec 16 '23 11:12 Hashirnihal

The politifact dataset does not match the paper: the pkl files contain train: 2555, val: 142, test: 173.

prigioni avatar Jun 26 '24 04:06 prigioni