
Parser kwargs not accepted

Open ed2050 opened this issue 1 year ago • 1 comments

The Crawler class takes a parser class (parser_cls) and a dict of extra_parser_args that it uses to instantiate the parser:

class Crawler:
    def __init__(
        ...
        parser_cls=Parser,
        extra_parser_args=None,
        ...
    ):
        ...
        parser_kwargs = {} if extra_parser_args is None else extra_parser_args
        self.parser = parser_cls(parser_threads, self.signal, self.session, **parser_kwargs)

Yet the Parser class doesn't accept any kwargs. Its constructor is:

class Parser(ThreadPool):
    def __init__(self, thread_num, signal, session):

Why is this the case? It causes problems when passing extra kwargs. For example
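A minimal sketch of the failure mode, using stand-in classes rather than the real icrawler ones so it runs on its own. ThreadPool and Parser here only mimic the signatures quoted above; build_parser, KwargsParser, and max_depth are hypothetical names introduced for illustration:

```python
class ThreadPool:
    """Stand-in base class, only here to keep the sketch self-contained."""

    def __init__(self, thread_num):
        self.thread_num = thread_num


class Parser(ThreadPool):
    # Same signature as quoted above: no extra keyword arguments accepted.
    def __init__(self, thread_num, signal, session):
        super().__init__(thread_num)
        self.signal = signal
        self.session = session


class KwargsParser(Parser):
    # Hypothetical subclass that consumes its own keyword argument
    # before delegating to the base constructor.
    def __init__(self, thread_num, signal, session, max_depth=1):
        super().__init__(thread_num, signal, session)
        self.max_depth = max_depth


def build_parser(parser_cls, extra_parser_args=None):
    # Mirrors the two lines quoted from Crawler.__init__.
    parser_kwargs = {} if extra_parser_args is None else extra_parser_args
    return parser_cls(1, None, None, **parser_kwargs)


# Passing extra kwargs to the base Parser raises TypeError,
# because its __init__ has no matching parameter.
try:
    build_parser(Parser, {"max_depth": 3})
    error = None
except TypeError as exc:
    error = exc

# A subclass that declares the parameter accepts the same dict.
parser = build_parser(KwargsParser, {"max_depth": 3})
```

Under these assumptions, extra kwargs only work when parser_cls is a subclass whose constructor declares them; the base class rejects any non-empty dict.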

ed2050 avatar Jun 05 '24 17:06 ed2050

Please let me know whether https://github.com/hellock/icrawler/commit/42730647da9b80f02562603137e346050dd5fa4e fixes your issue.

ZhiyuanChen avatar Jul 29 '24 13:07 ZhiyuanChen