python - Is there any method to use a separate Scrapy pipeline for each spider?
I want to fetch web pages under different domains, which means I have to use different spiders under the command "scrapy crawl myspider". However, I have to use different pipeline logic to put the data into the database, since the contents of the web pages are different. Right now, for every spider, items have to go through all of the pipelines defined in settings.py. Is there another, more elegant method to use separate pipelines for each spider?
The ITEM_PIPELINES setting is defined globally for all spiders in the project when the engine starts. It cannot be changed per spider on the fly.
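
For reference, a minimal sketch of what that global setting looks like in settings.py; the project and pipeline names are hypothetical, and the dict-with-priorities form shown here is the one used by recent Scrapy versions:

    # settings.py -- this mapping applies to every spider in the project
    ITEM_PIPELINES = {
        'myproject.pipelines.DatabasePipeline': 300,
    }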
Here are the options to consider:
Change the code of your pipelines. Skip/continue processing items returned by spiders in the process_item method of the pipeline, e.g.:

    def process_item(self, item, spider):
        if spider.name not in ['spider1', 'spider2']:
            return item
        # process item here
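
As a slightly fuller sketch of this option, a complete pipeline class could look like the following; the class name and the save_to_database helper are hypothetical placeholders:

    class Spider1Pipeline(object):
        """Processes items from spider1 only; all other items pass through untouched."""

        def process_item(self, item, spider):
            if spider.name != 'spider1':
                return item
            self.save_to_database(item)  # spider1-specific persistence logic
            return item

        def save_to_database(self, item):
            # placeholder for the actual database code
            pass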
Change the way you start crawling. Start it from a script and, based on the spider name passed as a parameter, override the ITEM_PIPELINES setting before calling crawler.configure().
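
The crawler.configure() call belongs to older Scrapy versions; a minimal sketch of the same idea using the current CrawlerProcess API might look like this (the spider names, project name, and pipeline paths are hypothetical):

    import sys

    from scrapy.crawler import CrawlerProcess
    from scrapy.utils.project import get_project_settings

    # hypothetical mapping from spider name to its own pipeline set
    PIPELINES_BY_SPIDER = {
        'spider1': {'myproject.pipelines.Spider1Pipeline': 300},
        'spider2': {'myproject.pipelines.Spider2Pipeline': 300},
    }

    spider_name = sys.argv[1]

    settings = get_project_settings()
    # override the global ITEM_PIPELINES before the crawler is built
    settings.set('ITEM_PIPELINES', PIPELINES_BY_SPIDER.get(spider_name, {}))

    process = CrawlerProcess(settings)
    process.crawl(spider_name)  # accepts a spider name or a Spider subclass
    process.start()

Note that recent Scrapy versions also support a custom_settings class attribute on the spider itself, which can carry a per-spider ITEM_PIPELINES dict; see the spider-specific settings link below.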
See also:
- Scrapy: how to change spider settings after start crawling?
- Can I use spider-specific settings?
- Using one Scrapy spider for several websites
- Related answer
Hope that helps.