python - Is there any method to use a separate Scrapy pipeline for each spider?
I want to fetch web pages under different domains, which means I have to use different spiders under the command "scrapy crawl myspider". However, I have to use different pipeline logic to put the data into the database, since the contents of the web pages are different. Right now, for every spider, items have to go through all of the pipelines defined in settings.py. Is there another, more elegant method to use separate pipelines for each spider?
The ITEM_PIPELINES setting is defined globally for all spiders in the project when the engine starts. It cannot be changed per spider on the fly.
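
For reference, a minimal sketch of what that global setting looks like in settings.py; the project and pipeline names are hypothetical, and the dict-with-priorities form shown here is the one used by recent Scrapy versions:

    # settings.py -- this mapping applies to every spider in the project
    ITEM_PIPELINES = {
        'myproject.pipelines.DatabasePipeline': 300,
    }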
Here are the options to consider:
Change the code of your pipelines. Skip/continue processing items returned by spiders in the process_item method of the pipeline, e.g.:

    def process_item(self, item, spider):
        if spider.name not in ['spider1', 'spider2']:
            return item
        # process item here
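
As a slightly fuller sketch of this option, a complete pipeline class could look like the following; the class name and the save_to_database helper are hypothetical placeholders:

    class Spider1Pipeline(object):
        """Processes items from spider1 only; all other items pass through untouched."""

        def process_item(self, item, spider):
            if spider.name != 'spider1':
                return item
            self.save_to_database(item)  # spider1-specific persistence logic
            return item

        def save_to_database(self, item):
            # placeholder for the actual database code
            pass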
Change the way you start crawling. Start it from a script and, based on the spider name passed as a parameter, override the ITEM_PIPELINES setting before calling crawler.configure().
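
The crawler.configure() call belongs to older Scrapy versions; a minimal sketch of the same idea using the current CrawlerProcess API might look like this (the spider names, project name, and pipeline paths are hypothetical):

    import sys

    from scrapy.crawler import CrawlerProcess
    from scrapy.utils.project import get_project_settings

    # hypothetical mapping from spider name to its own pipeline set
    PIPELINES_BY_SPIDER = {
        'spider1': {'myproject.pipelines.Spider1Pipeline': 300},
        'spider2': {'myproject.pipelines.Spider2Pipeline': 300},
    }

    spider_name = sys.argv[1]

    settings = get_project_settings()
    # override the global ITEM_PIPELINES before the crawler is built
    settings.set('ITEM_PIPELINES', PIPELINES_BY_SPIDER.get(spider_name, {}))

    process = CrawlerProcess(settings)
    process.crawl(spider_name)  # accepts a spider name or a Spider subclass
    process.start()

Note that recent Scrapy versions also support a custom_settings class attribute on the spider itself, which can carry a per-spider ITEM_PIPELINES dict; see the spider-specific settings link below.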
See also:
- Scrapy: how to change spider settings after start crawling?
- Can I use spider-specific settings?
- Using one Scrapy spider for several websites
- Related answer
Hope that helps.