haipproxy

:sparkling_heart: Highly available, distributed IP proxy pool, powered by Scrapy and Redis

Results: 49 haipproxy issues, sorted by recently updated

Bumps [scrapy](https://github.com/scrapy/scrapy) from 1.5.0 to 2.6.2. Release notes, sourced from scrapy's releases: 2.6.2 fixes a security issue around HTTP proxy usage, and addresses a few regressions introduced in Scrapy 2.6.0....

dependencies

Bumps [twisted](https://github.com/twisted/twisted) from 17.9.0 to 22.4.0. Release notes, sourced from twisted's releases: Twisted 22.4.0 (2022-04-11). Features: twisted.python.failure.Failure tracebacks now capture module information, improving compatibility with the Raven Sentry client. (#7796)...

dependencies

```
 => [2/7] RUN echo -e "https://mirrors.tuna.tsinghua.edu.cn/alpine/v3.7/main/\nhttps://mirrors.tuna.tsinghua.edu.   1.6s
 => ERROR [3/7] RUN apk upgrade --no-cache && apk add --no-cache squid libxml2-dev libxml2 libxslt-dev              0.9s
------
 > [3/7] RUN apk upgrade --no-cache...
```

Bumps [scrapy-splash](https://github.com/scrapy-plugins/scrapy-splash) from 0.7.2 to 0.8.0. Release notes, sourced from scrapy-splash's releases: 0.8.0 security bug fix: if you use HttpAuthMiddleware (i.e. the http_user and http_pass spider attributes) for Splash authentication,...

dependencies

I ran these four commands and proxy IPs are being collected, but I cannot fetch them with the Python client.

- Start the *scrapy workers*, including the proxy IP crawlers and validators:
  > python crawler_booter.py --usage crawler
  > python crawler_booter.py --usage validator
- Start the *schedulers*, including the timed scheduling and validation of proxy IPs:
  > python scheduler_booter.py --usage crawler
  > python scheduler_booter.py --usage validator

![1](https://user-images.githubusercontent.com/39333590/51961827-01c82180-2499-11e9-84d1-f0fc98bdf187.png) ![2](https://user-images.githubusercontent.com/39333590/51961828-0260b800-2499-11e9-9fdb-b8a07d8dd9bb.png)

Redis does contain data, but the provided client method returns an empty result, so something seems off here. I'm still new to this, so any pointers would be appreciated. ![3](https://user-images.githubusercontent.com/39333590/51961838-0987c600-2499-11e9-8cc7-3e5757383b8a.png)...
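One way to narrow this down is to inspect Redis directly and compare what the crawlers and validators wrote against what the client reads. The sketch below uses redis-py and a hypothetical key name (`haipproxy:validated:zhihu`); list the real keys first and adjust it to whatever your instance actually shows.

```python
# Sketch of a direct Redis check; the candidate key name below is a guess,
# so list the real keys first and adjust it to what your instance contains.
import redis

r = redis.StrictRedis(host='127.0.0.1', port=6379, password='123456', db=0,
                      decode_responses=True)

# 1) See which keys the crawlers/validators actually created.
for key in sorted(r.keys('*')):
    print(key, r.type(key))

# 2) If a validated queue for zhihu exists as a sorted set (hypothetical name),
#    dump its members to confirm the client has something to read.
candidate = 'haipproxy:validated:zhihu'
if r.exists(candidate):
    print(r.zrange(candidate, 0, -1, withscores=True))
```

If the keys the spiders write do not match what `ProxyFetcher` reads (for example because of a different host, password, or db number in the two configurations), the client will return an empty result even though Redis is populated.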

Example of calling it from the Python client:

```python
from client.py_cli import ProxyFetcher

args = dict(host='127.0.0.1', port=6379, password='123456', db=0)
# `zhihu` here means that IPs are fetched from the validated-proxy queue
# associated with `zhihu`; the same proxy IP can perform very differently
# against different target sites.
fetcher = ProxyFetcher('zhihu', strategy='greedy', redis_args=args)
# get one usable proxy
print(fetcher.get_proxy())
# get the list of usable proxies
print(fetcher.get_proxies())  # or print(fetcher.pool)
```
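To make the API above concrete, here is a small sketch of consuming the fetched proxy with plain `requests`; it assumes `get_proxy()` returns a full proxy URL such as `http://1.2.3.4:8080`, which may differ in your setup.

```python
import requests
from client.py_cli import ProxyFetcher

args = dict(host='127.0.0.1', port=6379, password='123456', db=0)
fetcher = ProxyFetcher('zhihu', strategy='greedy', redis_args=args)

# Assumption: get_proxy() returns something like 'http://1.2.3.4:8080'.
proxy = fetcher.get_proxy()
resp = requests.get('https://www.zhihu.com',
                    proxies={'http': proxy, 'https': proxy},
                    timeout=10)
print(proxy, resp.status_code)
```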

Fix the exception that occurred when installing cryptography, reported in #116.

```
=============================DEBUG ASSISTANCE=============================
If you are seeing a compilation error please try the following steps to
successfully install cryptography:
1) Upgrade to the latest pip and try again. This will fix...
```

1. It is suggested to add a command that upgrades pip to the Dockerfile: `python3 -m pip install --upgrade pip`. The modified file looks like this:
```Dockerfile
FROM centos:8
MAINTAINER ResolveWang
ENV LC_ALL C.UTF-8
ENV LANG C.UTF-8
RUN yum install squid -yq
RUN sed -i 's/http_access deny all/http_access...
```

rule.py
> validator scheduler will fetch tasks from resource queue and store into task queue

https://github.com/SpiderClub/haipproxy/blob/master/docs/%E9%85%8D%E7%BD%AE%E6%96%87%E4%BB%B6%E5%8F%82%E6%95%B0%E5%92%8C%E6%84%8F%E4%B9%89.md
> The validator fetches proxy IPs from task_queue and, after validation, stores them into resource; see the architecture document for the detailed flow.

Which of the two is written backwards? It should be temp -> validated, right?
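To make the `temp -> validated` reading concrete, here is a minimal sketch of that direction; the key names and probe URL are purely illustrative assumptions, not haipproxy's actual internals.

```python
# Illustrative sketch of the temp -> validated direction described above;
# key names and the probe URL are assumptions, not haipproxy's real layout.
import time
import redis
import requests

r = redis.StrictRedis(host='127.0.0.1', port=6379, db=0, decode_responses=True)

TEMP_QUEUE = 'haipproxy:temp:zhihu'            # filled by the crawlers (assumed)
VALIDATED_QUEUE = 'haipproxy:validated:zhihu'  # read by the client (assumed)

def validate_once():
    proxy = r.spop(TEMP_QUEUE)  # pull one unchecked proxy, if any
    if proxy is None:
        return
    try:
        requests.get('https://www.zhihu.com',
                     proxies={'http': proxy, 'https': proxy}, timeout=5)
    except requests.RequestException:
        return  # discard proxies that fail the probe
    # Move the survivor into the validated pool, scored by check time.
    r.zadd(VALIDATED_QUEUE, {proxy: int(time.time())})
```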