cx-extractor-python

chrislinan

Metadata

A Python implementation of the general-purpose web page content extraction algorithm based on the line-block distribution function, with added English support. Supports both Chinese and English.

Readme
Issues

Results 2 cx-extractor-python issues

Both print and write produce incorrect output

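```
Traceback (most recent call last):
  File "testEnglish.py", line 11, in <module>
    textfile.write(text)
TypeError: write() argument must be str, not coroutine
```

print shows a memory address instead of the text. The problem occurs both in testEnglish.py and in my own scripts. I'm new to multithreading and don't know how to handle this. Thanks in advance.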

BJdeBordeaux
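
The error reported above is what Python raises when a coroutine function is called without being awaited: the call returns a coroutine object rather than a string, so `write()` rejects it and `print` shows the object's repr, including its memory address. Below is a minimal sketch of the likely fix, assuming the extraction call is an `async def` function. `fetch_text` is a hypothetical stand-in, not the library's actual API, and the in-memory buffer stands in for the real output file.

```python
import asyncio
import io


async def fetch_text(url: str) -> str:
    """Hypothetical stand-in for an async extraction call (not the library's API)."""
    await asyncio.sleep(0)  # simulate asynchronous work such as a network request
    return f"extracted text for {url}"


def main() -> str:
    # Calling a coroutine function without awaiting it only creates a coroutine object.
    pending = fetch_text("http://example.com")
    assert asyncio.iscoroutine(pending)
    pending.close()  # discard it cleanly to avoid a "never awaited" warning

    # asyncio.run() drives the coroutine to completion and returns its str result.
    text = asyncio.run(fetch_text("http://example.com"))

    textfile = io.StringIO()  # in-memory stand-in for an open text file
    textfile.write(text)      # works because text is now a str
    return textfile.getvalue()


result = main()
```

Inside another `async def` function, `text = await fetch_text(url)` would be the equivalent; `asyncio.run()` is only for the outermost, synchronous entry point.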

A quick question for the author

Shouldn't each web page have LinesNum(content)-K+1 blocks in total? If it is LinesNum(content)-K, the last block is never evaluated.

zifengcoder
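
The count raised in this question can be checked directly. Below is a minimal sketch, assuming a block is a window of K consecutive lines that advances one line at a time and is scored by its total character count. `count_blocks` and `block_lengths` are illustrative names, not functions from the repository.

```python
from typing import List


def count_blocks(num_lines: int, k: int) -> int:
    """Number of windows of k consecutive lines in a page of num_lines lines."""
    return max(num_lines - k + 1, 0)


def block_lengths(lines: List[str], k: int) -> List[int]:
    """Total character count of each k-line block, one block per start index."""
    # range(len(lines) - k + 1) includes the final start index len(lines) - k,
    # so the last block (the final k lines) is scored too.
    return [
        sum(len(line) for line in lines[start:start + k])
        for start in range(len(lines) - k + 1)
    ]


lines = ["aa", "b", "", "cccc", "dd"]
k = 3
lengths = block_lengths(lines, k)
```

With five lines and K = 3 this yields three blocks, whereas looping over `range(len(lines) - k)` would yield only two and skip the block made of the final three lines.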

About

484

Stars

111

Forks

Watchers

Owner

chrislinan