grab issues

docs: Fix a few typos

There are small typos in:

- docs/grab/transport.rst
- docs/spider/intro.rst
- docs/spider/task.rst
- docs/spider/transport.rst
- docs/usage/installation.rst
- grab/base.py
- grab/spider/queue_backend/redis.py

Fixes:

- Should read `access` rather than `acess`.
- Should read...

timgates42

Fix grammar/spelling errors in README.md

Fixes #396

AMetIR

Fix grammar/spelling errors in README.md

AMetIR

Wrong Thread method for Python 3.9.0+

```
/lib/python3.10/site-packages/grab/spider/base_service.py", line 64, in is_alive
    return self.thread.isAlive()
AttributeError: 'Thread' object has no attribute 'isAlive'. Did you mean: 'is_alive'?
```
This is on Python 3.10, but I saw the same problem on 3.9.
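For context, `Thread.isAlive()` was removed in Python 3.9 in favour of `Thread.is_alive()`. Below is a minimal sketch of the likely fix. The `Service` class is a hypothetical stand-in for the code in `base_service.py`, whose actual contents are not shown in this report.

```python
import threading
import time


class Service:
    """Hypothetical stand-in for a spider service that wraps a worker thread."""

    def __init__(self, target):
        self.thread = threading.Thread(target=target, daemon=True)

    def start(self):
        self.thread.start()

    def is_alive(self):
        # Thread.is_alive() is the supported spelling; the camelCase
        # isAlive() alias was removed in Python 3.9.
        return self.thread.is_alive()


if __name__ == "__main__":
    service = Service(target=lambda: time.sleep(0.1))
    service.start()
    print(service.is_alive())
    service.thread.join()
    print(service.is_alive())
```

Since `is_alive()` has been available on all supported Python 3 versions, no version check should be needed.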

notzeldon

grab/pycurl fails to parse cookies

2

Content of `grab.response.head` at the moment the error happened: ``` b'HTTP/1.1 200 OK\r\nDate: Tue, 13 Jun 2017 22:16:36 GMT\r\nServer: Apache\r\nSet-Cookie: \xb3\xd2\xda\xcd\xd7=%96%A6g%9Ay%B0%A5g%A7tm%7C%95%9A; expires=Tue, 25-Jul-2017 14:16:36 GMT; path=/\r\nX-Powered-By: Apache2\r\nVary: Accept-Encoding\r\nContent-Encoding: gzip\r\nContent-Length: 4974\r\nContent-Type:...
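The cookie name in this header consists of raw non-ASCII bytes. The excerpt does not show the actual traceback, so the following only illustrates why such a header is awkward to parse: the name is not valid UTF-8, while a byte-preserving codec such as Latin-1 decodes it without error. The `decode_header_line` helper is hypothetical and not part of grab.

```python
# Bytes of the cookie name copied from the Set-Cookie header above.
RAW_COOKIE_NAME = b"\xb3\xd2\xda\xcd\xd7"


def decode_header_line(raw: bytes) -> str:
    """Hypothetical helper: decode header bytes, falling back to Latin-1.

    Latin-1 maps every byte to exactly one code point, so the fallback
    never raises and the original bytes can be recovered by re-encoding.
    """
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        return raw.decode("latin-1")


if __name__ == "__main__":
    name = decode_header_line(RAW_COOKIE_NAME)
    # Round-trip check: the fallback lost no information.
    print(name.encode("latin-1") == RAW_COOKIE_NAME)
```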

lorien

bug

cookies

Deprecation warning for defusedxml.lxml

Affected file: [grab/document.py](https://github.com/lorien/grab/blob/master/grab/document.py#L38)
```
>>> import libgenapi
...
/usr/local/lib/python3.9/site-packages/grab/document.py:35: DeprecationWarning: defusedxml.lxml is no longer supported and will be removed in a future release.
  import defusedxml.lxml
```
The defusedxml.lxml subpackage will...

opensource-assist

bug

spider: impossible to setup grab transport

2

I want to set CURLOPT_RESOLVE to a specific IP address, so in create_grab_instance() I wrote:
```
...
g.setup_transport('pycurl')
g.transport.curl.setopt(pycurl.RESOLVE, ['api.somesite.com:443:{}'.format(ip)])
return g
```
When I call spider.run(), I get the following...
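For reference, libcurl's `CURLOPT_RESOLVE` takes a list of `HOST:PORT:ADDRESS` strings. Below is a small sketch of building and validating such entries before they are handed to `setopt`. The helper name `resolve_entry` and the sample address are assumptions for illustration, and the sketch only handles IPv4 literals.

```python
import ipaddress


def resolve_entry(host: str, port: int, ip: str) -> str:
    """Build one CURLOPT_RESOLVE entry in HOST:PORT:ADDRESS form (IPv4 only)."""
    if not 0 < port < 65536:
        raise ValueError(f"port out of range: {port}")
    # Raises a ValueError subclass if ip is not a valid IPv4 literal.
    ipaddress.IPv4Address(ip)
    return f"{host}:{port}:{ip}"


if __name__ == "__main__":
    # 192.0.2.10 is a documentation-only address used as a placeholder.
    entries = [resolve_entry("api.somesite.com", 443, "192.0.2.10")]
    print(entries)
    # With pycurl installed, the list would be passed as:
    #   curl.setopt(pycurl.RESOLVE, entries)
```

Validating the entry up front makes a malformed address fail with a clear Python exception rather than at request time.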

ingvarbiz

bug

Sending empty header field

4

Sometimes there is a real need to send a header field with an empty value. The example [here](https://curl.haxx.se/libcurl/c/httpcustomheader.html) explains how to do that.
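In libcurl, a custom header written as `Name:` with nothing after the colon is removed from the request, while `Name;` (trailing semicolon) sends the header with an empty value. Below is a sketch of a formatter that produces such a list for `pycurl.HTTPHEADER`. The function name `to_curl_headers` is hypothetical and is not grab's API.

```python
def to_curl_headers(headers: dict) -> list:
    """Format a header mapping as libcurl-style header lines."""
    lines = []
    for name, value in headers.items():
        if value == "":
            # "Name;" asks libcurl to send the header with an empty value;
            # "Name:" would instead suppress the header entirely.
            lines.append(f"{name};")
        else:
            lines.append(f"{name}: {value}")
    return lines


if __name__ == "__main__":
    print(to_curl_headers({"X-Empty": "", "Accept": "text/html"}))
    # With pycurl installed, the result would be passed as:
    #   curl.setopt(pycurl.HTTPHEADER, lines)
```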

felyxjet

bug

Failed pycurl/resolve/cookies test

It works on my dev Debian machine. It fails in the GitHub Ubuntu CI environment. ```python @only_grab_transport("pycurl") def test_different_domains(self): import pycurl # pylint: disable=import-outside-toplevel grab = build_grab() names = [ "foo:%d:127.0.0.1"...

lorien

bug

cookies

domain name resolving

Update documentation url

The documentation URL redirects to the old page. With this change it will redirect to the new one.

patamimbre
