Writing items from a Scrapy spider to MongoDB

Time: 2019-06-09 16:35:29

Tags: python mongodb scrapy

I wrote a simple script to test a Mongo database:

import scrapy
from mango.items import MangoItem


class Quote(scrapy.Spider):
    name = "Quote"

    def start_requests(self):
        urls = ['http://quotes.toscrape.com/']
        for url in urls:
            yield scrapy.Request(url=url, callback=self.parse)

    def parse(self, response):
        rows = response.xpath('//div[@class="quote"]')
        for row in rows:
            # Create a fresh item per quote so yielded items do not share state.
            item = MangoItem()
            item['quote'] = row.xpath('span/text()').extract_first()
            item['author'] = row.xpath('span[2]/small/text()').extract_first()
            item['tags'] = row.xpath('div[@class="tags"]/meta/@content').extract_first()
            yield item
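
A side note on where the item object is created in a loop like the one in parse: if a single object is created before the loop and reused, every yielded value is the same object, so anything that holds on to earlier items sees only the last row's values. The sketch below shows this with plain dicts instead of Scrapy items; the function names and sample rows are hypothetical and exist only for illustration.

```python
def reuse_one_object(rows):
    # One dict created before the loop and mutated on every iteration.
    item = {}
    for row in rows:
        item["quote"] = row
        yield item


def fresh_object_per_row(rows):
    # A new dict for every row, so each yielded value is independent.
    for row in rows:
        yield {"quote": row}


rows = ["first", "second", "third"]  # hypothetical sample data
shared = list(reuse_one_object(rows))
fresh = list(fresh_object_per_row(rows))

shared_quotes = [d["quote"] for d in shared]
fresh_quotes = [d["quote"] for d in fresh]
```

Here shared holds three references to one dict, while fresh holds three separate dicts.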

This is my pipelines.py:

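import pymongo


class MangoPipeline(object):

    def __init__(self):
        # Connect to a local MongoDB server on the default port.
        self.conn = pymongo.MongoClient('localhost', 27017)
        # MongoDB creates the database and the collection lazily, on the first insert.
        db = self.conn['myquotes']
        self.collection = db['Quotes']

    def process_item(self, item, spider):
        # insert() is deprecated in PyMongo 3 and removed in PyMongo 4;
        # insert_one() expects a plain dict, so convert the item first.
        self.collection.insert_one(dict(item))
        return item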

The scraped items are displayed correctly in the terminal and no errors are raised, but no database shows up in the Mongo shell.
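
One thing that may be worth checking with these symptoms: Scrapy only calls a pipeline that is enabled through the ITEM_PIPELINES setting, so a pipeline class that is never registered produces no errors and no writes. A minimal sketch of the relevant settings.py entry follows. It assumes the project package is named mango (inferred from the import from mango.items import MangoItem) and that the pipeline lives in a module named pipelines; the priority 300 is an arbitrary choice.

```python
# settings.py (fragment)
# Assumption: the pipeline class is importable as mango.pipelines.MangoPipeline.
# The number is the pipeline's order; lower values run first.
ITEM_PIPELINES = {
    "mango.pipelines.MangoPipeline": 300,
}
```

If the pipeline is enabled, the Scrapy startup log should list it under the enabled item pipelines.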

0 answers:

No answers yet