
Add possibility to configure Run with entire Downloader, ItemPipeline and Processor instances

Open pavlokomarov opened this issue 3 years ago • 0 comments

The purpose of this PR is to create re-usable configurations for Downloader, ItemPipeline and Processor objects.

I use Roach with the Symfony bundle https://github.com/Ne-Lexa/roach-php-bundle, and currently I have to duplicate the pipeline and middleware configuration for each spider in my application. I think my proposal avoids this duplication:

    roach.run.first:
        parent: roach.run.base
        arguments:
            $downloaderMiddleware:
                - '@roach.downloader_middleware.first'
                - '@roach.downloader_middleware.second'
            $itemProcessors:
                - '@roach.item_processor.first'
                - '@roach.item_processor.second'
                
    roach.run.second:
        parent: roach.run.base
        arguments:
            $downloaderMiddleware:
                - '@roach.downloader_middleware.first'
                - '@roach.downloader_middleware.second'
            $itemProcessors:
                - '@roach.item_processor.first'
                - '@roach.item_processor.second'

and allows this:

    roach.downloader: ~
    roach.item_pipeline: ~

    roach.run.first:
        ~
        arguments:
            $downloader: '@roach.downloader'
            $itemPipeline: '@roach.item_pipeline'

    roach.run.second:
        ~
        arguments:
            $downloader: '@roach.downloader'
            $itemPipeline: '@roach.item_pipeline'

To avoid a BC break, I have extended the Run fields and added the possibility to configure the Engine either with an array of middlewares/processors or with entire instances of Downloader/ItemPipeline/Processor.
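
The backward-compatible shape described above could look roughly like the following. This is a minimal sketch, not the actual Roach code: the `Downloader`, `ItemPipeline` and `Run` classes here are simplified, hypothetical stand-ins, and the `resolveDownloader()` / `resolveItemPipeline()` helpers are invented for illustration. It assumes PHP 8.0+ (constructor promotion and named arguments).

```php
<?php

declare(strict_types=1);

// Hypothetical, simplified stand-ins; the real Roach classes have richer APIs.
final class Downloader
{
    /** @param list<object> $middleware */
    public function __construct(public array $middleware = [])
    {
    }
}

final class ItemPipeline
{
    /** @param list<object> $processors */
    public function __construct(public array $processors = [])
    {
    }
}

final class Run
{
    /**
     * The two array options model the existing configuration style;
     * the two nullable instances model the new, optional fields.
     *
     * @param list<object> $downloaderMiddleware
     * @param list<object> $itemProcessors
     */
    public function __construct(
        public array $downloaderMiddleware = [],
        public array $itemProcessors = [],
        public ?Downloader $downloader = null,
        public ?ItemPipeline $itemPipeline = null,
    ) {
    }
}

// Assumed resolution rule: a prebuilt instance wins, otherwise one is built from the array.
function resolveDownloader(Run $run): Downloader
{
    return $run->downloader ?? new Downloader($run->downloaderMiddleware);
}

function resolveItemPipeline(Run $run): ItemPipeline
{
    return $run->itemPipeline ?? new ItemPipeline($run->itemProcessors);
}

// Two runs share one preconfigured Downloader; a third uses the old array style.
$shared = new Downloader([new stdClass()]);
$first  = new Run(downloader: $shared);
$second = new Run(downloader: $shared);
$legacy = new Run(downloaderMiddleware: [new stdClass()]);
```

Because the new parameters are optional and come after the existing ones, callers that pass only arrays keep working unchanged, while callers that pass an instance can reuse it across runs.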

pavlokomarov, Oct 02 '22 10:10