I'm having trouble with a dict that gets yielded to two different functions (inside a for loop). In my case I create the dict in one function and sometimes yield it to two callbacks: function1 fills in 4 keys (then writes to csv), and function2 fills in 6 keys (4 of them the same as function1's, then writes to csv). The values of the 4 shared keys are always correct, but the 2 extra keys get copied into function1's output whenever function2 is also yielded. Run separately, both functions behave correctly.
The code is a Scrapy crawl spider run with scrapy crawl test -o test.csv.
I fixed it by adding the two missing keys in function1 as well (filled with empty values). My question is: why does this overlap happen?
Smaller functions that show the same behavior (hoping Google returns the same results for you as it does for me):
# -*- coding: utf-8 -*-
from scrapy.spiders import CrawlSpider
from scrapy.http.request import Request
from urllib.parse import urlparse


class Stackoverflow(CrawlSpider):
    name = 'test'
    start_urls = ['https://www.google.com/search?q=SQL',
                  'https://www.google.com/search?q=hello+world']

    def parse(self, response):
        item = dict()
        item['hello world'] = 123
        links = response.xpath('//div[@class="r"]/a')
        temp_list = []
        for link in links:
            small_dict = dict()
            href = link.xpath('@href').extract_first()
            if "wikipedia.org" in urlparse(href).netloc:
                small_dict['wikipedia'] = href
                temp_list.append(small_dict)
            if "learnpython.org" in urlparse(href).netloc:
                small_dict['learnpython'] = href
                temp_list.append(small_dict)
        for my_dict in temp_list:
            for func, link in my_dict.items():
                yield Request(link, callback=getattr(self, func), meta={'item': item})

    def wikipedia(self, response):
        item = response.meta['item']
        item['title'] = response.xpath('//title/text()').extract_first()
        image = response.xpath('//a[@class="image"]/img/@src').extract_first()
        item['image_title'] = response.urljoin(image)
        yield item

    def learnpython(self, response):
        item = response.meta['item']
        item['title'] = response.xpath('//title/text()').extract_first()
        yield item