Add possibility to configure Run with entire Downloader, ItemPipeline and Processor instances
The purpose of this PR is to create re-usable configurations for Downloader, ItemPipeline and Processor objects.
I use Roach with the Symfony bundle https://github.com/Ne-Lexa/roach-php-bundle, and currently I have to duplicate the pipeline and middleware configuration for each spider in my application. I think this proposal removes duplication like this:
roach.run.first:
    parent: roach.run.base
    arguments:
        $downloaderMiddleware:
            - '@roach.downloader_middleware.first'
            - '@roach.downloader_middleware.second'
        $itemProcessors:
            - '@roach.item_processor.first'
            - '@roach.item_processor.second'
roach.run.second:
    parent: roach.run.base
    arguments:
        $downloaderMiddleware:
            - '@roach.downloader_middleware.first'
            - '@roach.downloader_middleware.second'
        $itemProcessors:
            - '@roach.item_processor.first'
            - '@roach.item_processor.second'
and allows this:
roach.downloader: ~
roach.item_pipeline: ~
roach.run.first:
    ~
    arguments:
        $downloader: '@roach.downloader'
        $itemPipeline: '@roach.item_pipeline'
roach.run.second:
    ~
    arguments:
        $downloader: '@roach.downloader'
        $itemPipeline: '@roach.item_pipeline'
To avoid a BC break, I have extended the Run fields and added the ability to configure the Engine either with an array of middlewares/processors or with entire Downloader/ItemPipeline/Processor instances.
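To illustrate the backward-compatible idea, here is a minimal sketch of how a run object could accept either form. It is not the actual code from this PR: `SketchItemPipeline`, `SketchRun` and `SketchEngine` are hypothetical stand-ins for the real Roach classes, and the resolution rule (a ready-made instance wins, otherwise one is built from the array) is an assumption about how the fallback might work.

```php
<?php

declare(strict_types=1);

// Hypothetical stand-in for an item pipeline: applies processors in order.
final class SketchItemPipeline
{
    /** @param array<int, callable> $processors each takes and returns an item array */
    public function __construct(private array $processors = [])
    {
    }

    public function process(array $item): array
    {
        foreach ($this->processors as $processor) {
            $item = $processor($item);
        }

        return $item;
    }
}

// Hypothetical stand-in for Run: keeps the old array field and adds an
// optional field that holds a complete, pre-configured pipeline instance.
final class SketchRun
{
    /** @param array<int, callable> $itemProcessors */
    public function __construct(
        public array $itemProcessors = [],
        public ?SketchItemPipeline $itemPipeline = null,
    ) {
    }
}

// Hypothetical stand-in for the Engine: prefers the shared instance and
// falls back to building a pipeline from the array (assumed behaviour).
final class SketchEngine
{
    public function resolvePipeline(SketchRun $run): SketchItemPipeline
    {
        return $run->itemPipeline ?? new SketchItemPipeline($run->itemProcessors);
    }
}

// Usage: existing callers keep passing arrays, new callers share one instance.
$trim = fn (array $item): array => array_map('trim', $item);

$engine = new SketchEngine();
$legacyRun = new SketchRun(itemProcessors: [$trim]);

$sharedPipeline = new SketchItemPipeline([$trim]);
$firstRun = new SketchRun(itemPipeline: $sharedPipeline);
$secondRun = new SketchRun(itemPipeline: $sharedPipeline);
```

Because the array arguments keep their defaults and the instance argument is optional, existing service definitions that pass `$itemProcessors` would continue to work unchanged under this sketch, while several runs can reference one shared pipeline service.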