MediaCrawler
Zhihu crawler: Request error: {"error":{"code":101,"name":"AuthenticationError","message":"ZERR_NOT_LOGIN"}}
I'm not sure how to resolve this error. Any guidance would be appreciated.
2024-12-09 09:11:10 MediaCrawler INFO (core.py:289) - [ZhihuCrawler.launch_browser] Begin create browser context ...
2024-12-09 09:11:12 MediaCrawler INFO (core.py:258) - [ZhihuCrawler.create_zhihu_client] Begin create zhihu API client ...
2024-12-09 09:11:12 MediaCrawler INFO (client.py:136) - [ZhiHuClient.pong] Begin to pong zhihu...
2024-12-09 09:11:12 MediaCrawler ERROR (client.py:90) - [ZhiHuClient.request] Requset Url: https://www.zhihu.com/api/v4/me?include=email%2Cis_active%2Cis_bind_phone, Request error: {"error":{"code":100,"name":"AuthenticationInvalidRequest","message":"请求头或参数封装错误"}}
2024-12-09 09:11:13 MediaCrawler ERROR (client.py:90) - [ZhiHuClient.request] Requset Url: https://www.zhihu.com/api/v4/me?include=email%2Cis_active%2Cis_bind_phone, Request error: {"error":{"code":100,"name":"AuthenticationInvalidRequest","message":"请求头或参数封装错误"}}
2024-12-09 09:11:15 MediaCrawler ERROR (client.py:90) - [ZhiHuClient.request] Requset Url: https://www.zhihu.com/api/v4/me?include=email%2Cis_active%2Cis_bind_phone, Request error: {"error":{"code":100,"name":"AuthenticationInvalidRequest","message":"请求头或参数封装错误"}}
2024-12-09 09:11:15 MediaCrawler ERROR (client.py:146) - [ZhiHuClient.pong] Ping zhihu failed: RetryError[<Future at 0x7f9f24a9ad90 state=finished raised DataFetchError>], and try to login again...
2024-12-09 09:11:15 MediaCrawler INFO (login.py:58) - [ZhiHu.begin] Begin login zhihu ...
2024-12-09 09:11:15 MediaCrawler INFO (login.py:108) - [ZhiHu.login_by_cookies] Begin login zhihu by cookie ...
2024-12-09 09:11:15 MediaCrawler INFO (core.py:89) - [ZhihuCrawler.start] Zhihu跳转到搜索页面获取搜索页面的Cookies,该过程需要5秒左右
2024-12-09 09:11:21 MediaCrawler INFO (core.py:111) - [ZhihuCrawler.search] Begin search zhihu keywords
2024-12-09 09:11:21 MediaCrawler INFO (core.py:118) - [ZhihuCrawler.search] Current search keyword: 汽车
2024-12-09 09:11:21 MediaCrawler INFO (core.py:127) - [ZhihuCrawler.search] search zhihu keyword: 汽车, page: 1
2024-12-09 09:11:22 MediaCrawler INFO (client.py:212) - [ZhiHuClient.get_note_by_keyword] Search result: {'paging': {'is_end': False, 'next': 'https://api.zhihu.com/search_v3?advert_count=0&correction=1&filter_fields=&gk_version=gz-gaokao&lc_idx=0&limit=20&offset=20&q=%E6%B1%BD%E8%BD%A6&search_hash_id=6cd8c97771a7c7d75956db132dbae18d&search_source=Filter&show_all_topics=0&sort=&t=general&time_interval=&vertical=&vertical_info=0%2C0%2C0%2C0%2C0%2C0%2C0%2C0%2C0%2C2%2C0'}, 'data': [{'type': 'search_result', 'highlight': {'d
······
stion_id': '562563592', 'title': '电动汽车为什么一下爆发了?', 'desc': '人类早就有能力造电动车了啊,是什么契机,技术进步这两年爆发的?', 'created_time': 1709466282, 'updated_time': 1710330624, 'voteup_count': 15420, 'comment_count': 2457, 'source_keyword': '汽车', 'user_id': '51c6b8c755f776354a3e966e881bbbda', 'user_link': 'https://www.zhihu.com/people/di-ren-jie-57', 'user_nickname': '狄仁杰', 'user_avatar': 'https://pic1.zhimg.com/50/v2-2df28424e6e8e451a5454c091bf6c0ae_l.jpg?source=4e949a73', 'user_url_token': 'di-ren-jie-57', 'last_modify_ts': 1733735482784}
2024-12-09 09:11:22 MediaCrawler INFO (core.py:168) - [ZhihuCrawler.batch_get_content_comments] Crawling comment mode is not enabled
2024-12-09 09:11:22 MediaCrawler INFO (core.py:127) - [ZhihuCrawler.search] search zhihu keyword: 汽车, page: 2
2024-12-09 09:11:23 MediaCrawler ERROR (client.py:90) - [ZhiHuClient.request] Requset Url: https://www.zhihu.com/api/v4/search_v3?gk_version=gz-gaokao&t=general&q=%E6%B1%BD%E8%BD%A6&correction=1&offset=20&limit=20&filter_fields=&lc_idx=20&show_all_topics=0&search_source=Filter&time_interval=&sort=&vertical=, Request error: {"error":{"code":101,"name":"AuthenticationError","message":"ZERR_NOT_LOGIN"}}
2024-12-09 09:11:24 MediaCrawler ERROR (client.py:90) - [ZhiHuClient.request] Requset Url: https://www.zhihu.com/api/v4/search_v3?gk_version=gz-gaokao&t=general&q=%E6%B1%BD%E8%BD%A6&correction=1&offset=20&limit=20&filter_fields=&lc_idx=20&show_all_topics=0&search_source=Filter&time_interval=&sort=&vertical=, Request error: {"error":{"code":101,"name":"AuthenticationError","message":"ZERR_NOT_LOGIN"}}
2024-12-09 09:11:25 MediaCrawler ERROR (client.py:90) - [ZhiHuClient.request] Requset Url: https://www.zhihu.com/api/v4/search_v3?gk_version=gz-gaokao&t=general&q=%E6%B1%BD%E8%BD%A6&correction=1&offset=20&limit=20&filter_fields=&lc_idx=20&show_all_topics=0&search_source=Filter&time_interval=&sort=&vertical=, Request error: {"error":{"code":101,"name":"AuthenticationError","message":"ZERR_NOT_LOGIN"}}
Also, the exception raised here is tenacity.RetryError, which an except clause for DataFetchError cannot catch: https://github.com/NanmiCoder/MediaCrawler/blob/dc9116e098cb1daddce369ced30db9edbbc361b7/media_platform/zhihu/core.py#L141
Traceback (most recent call last):
  File "/opt/conda/envs/media/lib/python3.9/site-packages/tenacity/_asyncio.py", line 50, in __call__
    result = await fn(*args, **kwargs)
  File "/data/media_platform/zhihu/client.py", line 96, in request
    raise DataFetchError(response.text)
media_platform.zhihu.exception.DataFetchError: {"error":{"code":101,"name":"AuthenticationError","message":"ZERR_NOT_LOGIN"}}

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/data/main.py", line 66, in <module>
    asyncio.get_event_loop().run_until_complete(main())
  File "/opt/conda/envs/media/lib/python3.9/asyncio/base_events.py", line 642, in run_until_complete
    return future.result()
  File "/data/main.py", line 56, in main
    await crawler.start()
  File "/data/media_platform/zhihu/core.py", line 96, in start
    await self.search()
  File "/data/media_platform/zhihu/core.py", line 127, in search
    content_list: List[ZhihuContent] = await self.zhihu_client.get_note_by_keyword(
  File "/data/media_platform/zhihu/client.py", line 211, in get_note_by_keyword
    search_res = await self.get(uri, params)
  File "/data/media_platform/zhihu/client.py", line 128, in get
    return await self.request(method="GET", url=zhihu_constant.ZHIHU_URL + final_uri, headers=headers, **kwargs)
  File "/opt/conda/envs/media/lib/python3.9/site-packages/tenacity/_asyncio.py", line 88, in async_wrapped
    return await fn(*args, **kwargs)
  File "/opt/conda/envs/media/lib/python3.9/site-packages/tenacity/_asyncio.py", line 47, in __call__
    do = self.iter(retry_state=retry_state)
  File "/opt/conda/envs/media/lib/python3.9/site-packages/tenacity/__init__.py", line 326, in iter
    raise retry_exc from fut.exception()
tenacity.RetryError: RetryError[<Future at 0x7f2fadc7fbb0 state=finished raised DataFetchError>]
Is this for Zhihu Q&A?
It's Zhihu keyword search. I'm not sure whether that has anything to do with Q&A.
Did you get it working? I keep getting errors.