Missing information when scraping - possibly due to a wrong CSS selector

Time: 2019-07-14 12:34:56

Tags: python web-scraping scrapy css-selectors

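I am currently trying to do some scraping with Scrapy in Python, but I cannot solve this problem (I have tried many things and even asked a question on Stack before, see the URL below).

I am scraping two websites (they are in my script) and I do get results back. However, some information is missing and I cannot find out why.

The scraper works, but I am unable to scrape the 'Sponsored Tag' (see the item['Sponsored_Tag'] part in the code).

My question: how can I get the same results as now, but including the sponsored tag?

What have I tried? Many things. For example, changing the response.css() selector to "s-result-list s-search-results".

I thought I had the solution. If you look at one of the pages and search for "s-result-item" (which is our response.css selector), you can see that the result contains a text called "AdHolder". However, I cannot find it in the scraped results... (see the image below)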

PrintScreen of the s-result-item. You see that there is something which is called AdHolder...

Desired result

A file (currently I am writing a JSON file) containing the following information:

 - Sponsored Tag: Yes/No           **#This is what is missing!**
 - ASIN: XXXXXXXX                  #This works in the code below
 - Index: "0"                      #This works in the code below 
 - Link: "complete link"           #This works in the code below 
 - url_response: "response link"   #This works in the code below
 - tag: Bestsellertag etc.         #This works in the code below 
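
To make the target concrete, one record in the JSON file could be shaped roughly as in the sketch below. The field names follow the AmazonItem class from the code in this question; the helper name to_record and the placeholder values are hypothetical and only illustrate the Yes/No mapping for the sponsored tag.

```python
import json


def to_record(is_sponsored, asin, index, link, url_response, tag):
    """Build one output record; the sponsored flag is stored as Yes/No."""
    return {
        "Sponsored_Tag": "Yes" if is_sponsored else "No",
        "asin": asin,
        "index": index,
        "link": link,
        "url_Response": url_response,
        "tag": tag,
    }


# Placeholder values taken from the list above, not real scraped data.
record = to_record(True, "XXXXXXXX", "0", "complete link", "response link", None)
print(json.dumps(record, indent=2))
```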

My code:

from twisted.internet import reactor
import scrapy
from scrapy.crawler import CrawlerRunner
from scrapy.utils.log import configure_logging
#import re

class AmazonProductSpider(scrapy.Spider):
    name = "AmazonDeals"
    allowed_domains = ["amazon.com"]

    DOWNLOADER_MIDDLEWARES = {
    'scrapy.downloadermiddlewares.useragent.UserAgentMiddleware': None,
    'scrapy_fake_useragent.middleware.RandomUserAgentMiddleware': 400,}
    #Use working product URL below
    start_urls = [
            "https://www.amazon.com/s?k=shaver&ref=nb_sb_noss_1",          # Shaver
            "https://www.amazon.com/s?k=electric+shaver&ref=nb_sb_noss_2"]#

    custom_settings = {
        'FEED_URI' : 'Asin_Titles.json',
        'FEED_FORMAT' : 'json'
        }

    def parse(self, response):
        for product in response.css('.s-result-item'):   # Do I scrape the wrong information? 
            item = AmazonItem()

            # I think that this part goes wrong (the item['Sponsored_Tag'] part)
            item['Sponsored_Tag'] = product.css('span:contains("Sponsored")') 
            #item['Sponsored_Tag'] = product.css('.s-result-item').get() #.css('contains("sponsored")').get()

            item['Prime_tag'] = product.css('.a-color-secondary').get()
            item['asin'] = product.css('::attr(data-asin)').get()
            item['index'] = product.css('::attr(data-index)').get()
            item['link'] = product.css('.a-text-normal::attr(href)').get() 
            item['url_Response'] = response.url
            item['tag'] = product.css('.a-badge-text').get()
            yield item

class AmazonItem(scrapy.Item):
    asin = scrapy.Field()
    index = scrapy.Field()
    link = scrapy.Field()
    url_Response = scrapy.Field()
    tag = scrapy.Field()
    Prime_tag = scrapy.Field() 
    Sponsored_Tag = scrapy.Field()

Edit 1

As ThePyGuy suggested, I integrated the solution. Unfortunately, all results for the "adholder" item are NULL.

    def parse(self, response):
        item = AmazonItem()

        for result in response.css('.s-result-list div'):
            if result.css('.AdHolder').extract_first():
                item['adholder'] = True
            else:
                item['adholder'] = False

        for product in response.css('.s-result-item'):    #.s-result-item
            #item = AmazonItem()
            #item['Sponsored_Tag'] = product.css('span:contains("sponsored")').get()
            #item['Sponsored_Tag'] = product.css('.s-result-item').get() #.css('contains("sponsored")').get()

            item['Prime_tag'] = product.css('.a-color-secondary').get()
            item['asin'] = product.css('::attr(data-asin)').get()
            item['index'] = product.css('::attr(data-index)').get()
            item['link'] = product.css('.a-text-normal::attr(href)').get()
            item['url_Response'] = response.url
            item['tag'] = product.css('.a-badge-text').get()
            # And so on
            # ...
            yield item

class AmazonItem(scrapy.Item):
    asin = scrapy.Field()
    index = scrapy.Field()
    link = scrapy.Field()
    url_Response = scrapy.Field()
    tag = scrapy.Field()
    Prime_tag = scrapy.Field()
    #Sponsored_Tag = scrapy.Field()
    adholder = scrapy.Field()

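Edit 2

As ThePyGuy suggested, everything is now in a single loop. There are two problems:

  1. AdHolder (the sponsored marker) is not being scraped: every value is FALSE, which cannot be right.
  2. We now get far too many products, roughly 3,095, whereas I expected about 80 to 100 (two pages with 40 to 50 products each).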

    def parse(self, response):

        for result in response.css('.s-result-list div'):
            item = AmazonItem()

            if result.css('.AdHolder').extract_first():
                item['adholder'] = True
            else:
                item['adholder'] = False

            item['Prime_tag'] = result.css('.a-color-secondary').get()
            item['asin'] = result.css('::attr(data-asin)').get()
            item['index'] = result.css('::attr(data-index)').get()
            item['link'] = result.css('.a-text-normal::attr(href)').get()
            item['url_Response'] = response.url
            item['tag'] = result.css('.a-badge-text').get()
            # And so on
            # ...
            yield item
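
One possible explanation for the second problem, stated as an assumption and not verified against the live page: '.s-result-list div' is a descendant selector, so it matches every div nested anywhere inside the list, not only the top-level result containers. The sketch below uses Python's standard html.parser on a small made-up fragment (the markup and the DivCounter class are hypothetical) to count both kinds of element.

```python
from html.parser import HTMLParser

# Hypothetical miniature of a results page: two results, each with nested divs.
FRAGMENT = """
<div class="s-result-list">
  <div class="s-result-item AdHolder" data-asin="A1">
    <div class="sg-col-inner"><div class="a-section"></div></div>
  </div>
  <div class="s-result-item" data-asin="A2">
    <div class="sg-col-inner"><div class="a-section"></div></div>
  </div>
</div>
"""


class DivCounter(HTMLParser):
    """Counts divs inside .s-result-list and how many are .s-result-item."""

    def __init__(self):
        super().__init__()
        self.depth_in_list = 0  # 0 outside the list, >= 1 once inside it
        self.all_divs = 0       # what '.s-result-list div' would match
        self.result_items = 0   # what '.s-result-item' would match

    def handle_starttag(self, tag, attrs):
        if tag != "div":
            return
        classes = (dict(attrs).get("class") or "").split()
        if self.depth_in_list:
            self.depth_in_list += 1
            self.all_divs += 1
            if "s-result-item" in classes:
                self.result_items += 1
        elif "s-result-list" in classes:
            self.depth_in_list = 1

    def handle_endtag(self, tag):
        if tag == "div" and self.depth_in_list:
            self.depth_in_list -= 1


counter = DivCounter()
counter.feed(FRAGMENT)
print(counter.all_divs, counter.result_items)  # prints "6 2"
```

If the real page behaves the same way, looping over response.css('.s-result-item') again should bring the count back to one item per result.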
    

Thanks in advance for your help.

1 answer:

Answer 0 (score: 0)

In [4]: print(response.css('.s-result-list .AdHolder').extract())                                                                                                                                                                                                              
['<div data-asin="B07F7XYMNN" data-index="0" class="sg-col-4-of-24 sg-col-4-of-12 sg-col-4-of-36 s-result-item sg-col-4-of-28 sg-col-4-of-16 AdHolder sg-col sg-col-4-of-20 sg-col-4-of-32"><div class="sg-col-inner">\n    \n\n\n<div data-component-type="s-impression-logger" data-component-props=\'{"percentageShownToFire":"50","batchable":true,"requiredElementSelector":".s-image","url":"https://www.amazon.com/gp/sponsored-products/logging/log-action.html?qualifier=1563148367&amp;id=6965017493400297&amp;widgetName=sp_atf&amp;adId=200011353751711&amp;eventType=1&amp;adIndex=0"}\' class="rush-component s-expand-height">\n    \n\n\n<div data-component-type="sp-sponsored-result" class="rush-component s-expand-height">\n    \n\n\n\n\n\n\n\n\n<div class="s-expand-height s-include-content-margin s-border-bottom">\n<div class="a-section a-spacing-medium">\n\n\n<div class="sg-row">\n  <div class="sg-col-4-of-24 sg-col-4-of-12 sg-col-4-of-36 sg-col-4-of-28 sg-col-4-of-16 sg-col sg-col-4-of-20 sg-col-4-of-32"><div class="sg-col-inner">\n        <div class="a-section a-spacing-micro s-min-height-extra-large">\n            \n        </div>\n    </div></div>\n</div>\n\n<div class="sg-row">\n  <div class="sg-col-4-of-24 sg-col-4-of-12 sg-col-4-of-36 sg-col-4-of-28 sg-col-4-of-16 sg-col sg-col-4-of-20 sg-col-4-of-32"><div class="sg-col-inner">\n        \n        <div class="a-section a-spacing-none">\n            \n\n\n\n\n\n<span data-component-type="s-product-image" class="rush-component">\n    \n    <a class="a-link-normal" href="/gp/slredirect/picassoRedirect.html/ref=pa_sp_atf_aps_sr_pg1_1?ie=UTF8&amp;adId=A0414635KJLKI9OVJ7G1&amp;url=%2FBraun-Electric-Integrated-Precision-Rechargeable%2Fdp%2FB07F7XYMNN%2Fref%3Dsr_1_1_sspa%3Fkeywords%3Dshaver%26qid%3D1563148367%26s%3Dgateway%26sr%3D8-1-spons%26psc%3D1&amp;qualifier=1563148367&amp;id=6965017493400297&amp;widgetName=sp_atf">\n        <div class="a-section aok-relative s-image-square-aspect">\n            \n                \n        
        \n                    <img src="https://m.media-amazon.com/images/I/81Y8IzpMY2L._AC_UL320_.jpg" class="s-image" alt="Braun Series 9 Men\'s Electric Foil Shaver with Wet &amp; Dry Integrated Precision Trimmer &amp; Rechargeable and Cordless Razor with Clean&amp;Charge Station, 9296cc" srcset="https://m.media-amazon.com/images/I/81Y8IzpMY2L._AC_UL320_.jpg 1x, https://m.media-amazon.com/images/I/81Y8IzpMY2L._AC_UL480_QL65_.jpg 1.5x, https://m.media-amazon.com/images/I/81Y8IzpMY2L._AC_UL640_QL65_.jpg 2x, https://m.media-amazon.com/images/I/81Y8IzpMY2L._AC_UL800_QL65_.jpg 2.5x, https://m.media-amazon.com/images/I/81Y8IzpMY2L._AC_UL960_QL65_.jpg 3x" data-image-index="0" data-image-load="" data-image-latency="s-product-image" data-image-source-density="1">\n                \n            \n        </div>\n    </a>\n</span>\n\n        </div>\n        \n  </div></div>\n  <div class="sg-col-4-of-24 sg-col-4-of-12 sg-col-4-of-36 sg-col-4-of-28 sg-col-4-of-16 sg-col sg-col-4-of-20 sg-col-4-of-32"><div class="sg-col-inner">\n        \n        <div class="a-section a-spacing-none a-spacing-top-small">\n            <div class="a-row a-spacing-micro"><span class="a-size-base a-color-secondary">Sponsored</span>\n\n\n\n\n\n<span class="a-declarative" data-action="a-popover" data-a-popover=\'{"dataStrategy":"preload","name":"sp-info-popover-B07F7XYMNN","position":"triggerVertical"}\'>\n    \n        \n        \n            \n            <span class="aok-inline-block s-info-icon"></span>\n        \n        \n    \n</span>\n\n\n\n    <div class="a-popover-preload" id="a-popover-sp-info-popover-B07F7XYMNN">\n        <span>These are ads for products you\'ll find on Amazon.com. 
</span><div class="a-row"><span>Clicking an ad will take you to the product\'s page.</span>\n\n\n\n\n<a class="a-link-normal" href="https://advertising.amazon.com/products-self-serve?ref_=ext_amzn_wtsp">\n    \n        \n            \n                <span>Learn more about Sponsored Products.</span>\n            \n        \n        \n    \n</a>\n</div><div class="a-row a-spacing-top-small"><span></span>\n\n\n\n\n\n<span class="a-declarative" data-action="a-modal" data-a-modal=\'{"dataStrategy":"ajax","header":"Share your feedback","url":"/gp/sponsored-products/lazyLoad/handler/sp-feedback-handler.html?pl=lB0mnfW%2BgTIwXZ6x6%2FEkofSU5xhfxxnnc785QXRwbCnTJ%2BJA6GNDd0mNO0LWBiOz0%2BPUMqC%2BfpmF%0Amq1O8nxbZonbXZAJRS9av%2BV16idXdoOcIz11gXk310EIW6PcOMYdRA%2Bp7Z%2FEeOGfKzIeFtB9qQvN%0AjDcmdjwVXrg9HYDbd1wHIPKQtdWBZdOar0OZInRL0%2F7pFW12O6KO3SMD%2B1v35y5myOGZmF51DLTd%0Ar0Eot11Sc8HtuVRMXgD1s8WIwu%2F0zt6zF3tg3EMcdWFtwMCECLKa2xwQfYLDM6NKIeQvOsky9j19%0A0mXHU7i%2FXQ9fL70%2Bf7m0aTvN8LwIHzdNZM6f6qiuarbVWcVp%2B1BM7Q0NyT33bHOLdwKR7DmhKH03%0AjWCKWNQnpeVaWAm%2BuwDwrOBtCI4voFa%2BK4IX5hBAzvyzgBCtATDpdIllsxGZj7dvvaFrkCdPE7w%2B%0AZ%2BzM8Bhfm4SU3nL%2FTRgbAEwM1b%2BMMAW3uFvcDWFHmWmwwTkce6uiLD5d83valaN%2FEgY03Xx0DsSd%0AROuQ8XS7DCpmy7rFmNXOEFSlWSOBf68IZKQniiJid90PjI9TySMEuuA%3D"}\'>\n    \n        \n        \n        \n            \n\n\n\n\n<a class="a-link-normal" href="#">\n    \n        \n            \n                <span></span>\n            \n        \n        \n    \n</a>\n\n        \n    \n</span>\n\n\n\n</div>\n    </div>\n\n</div>\n\n\n\n\n<h2 class="a-size-mini a-spacing-none a-color-base s-line-clamp-4">\n    \n    \n        \n\n\n\n\n<a class="a-link-normal a-text-normal" 
href="/gp/slredirect/picassoRedirect.html/ref=pa_sp_atf_aps_sr_pg1_1?ie=UTF8&amp;adId=A0414635KJLKI9OVJ7G1&amp;url=%2FBraun-Electric-Integrated-Precision-Rechargeable%2Fdp%2FB07F7XYMNN%2Fref%3Dsr_1_1_sspa%3Fkeywords%3Dshaver%26qid%3D1563148367%26s%3Dgateway%26sr%3D8-1-spons%26psc%3D1&amp;qualifier=1563148367&amp;id=6965017493400297&amp;widgetName=sp_atf">\n    \n        \n            \n                <span class="a-size-base-plus a-color-base a-text-normal">Braun Series 9 Men\'s Electric Foil Shaver with Wet &amp; Dry Integrated Precision Trimmer &amp; Rechargeable and Cordless Razor with Clean&amp;Charge Station, 9296cc</span>\n            \n        \n        \n    \n</a>\n\n    \n</h2>\n\n        </div>\n        \n            <div class="a-section a-spacing-none a-spacing-top-micro">\n                <div class="a-row a-size-small">\n\n\n<span aria-label="4.4 out of 5 stars">\n    \n\n\n\n\n\n\n    \n        <span class="a-declarative" data-action="a-popover" data-a-popover=\'{"max-width":"700","closeButton":false,"position":"triggerBottom","url":"/review/widgets/average-customer-review/popover/ref=acr_search__popover?ie=UTF8&amp;asin=B07F7XYMNN&amp;ref=acr_search__popover&amp;contextId=search"}\'>\n            \n            <a href="javascript:void(0)" class="a-popover-trigger a-declarative"><i class="a-icon a-icon-star-small a-star-small-4-5 aok-align-bottom"><span class="a-icon-alt">4.4 out of 5 stars</span></i><i class="a-icon a-icon-popover"></i></a>\n        </span>\n    \n    \n\n\n</span>\n\n\n\n<span aria-label="89">\n    \n\n\n\n\n<a class="a-link-normal" href="/gp/slredirect/picassoRedirect.html/ref=pa_sp_atf_aps_sr_pg1_1?ie=UTF8&amp;adId=A0414635KJLKI9OVJ7G1&amp;url=%2FBraun-Electric-Integrated-Precision-Rechargeable%2Fdp%2FB07F7XYMNN%2Fref%3Dsr_1_1_sspa%3Fkeywords%3Dshaver%26qid%3D1563148367%26s%3Dgateway%26sr%3D8-1-spons%26psc%3D1&amp;qualifier=1563148367&amp;id=6965017493400297&amp;widgetName=sp_atf#customerReviews">\n    \n        \n            
\n                <span class="a-size-base">89</span>\n            \n        \n        \n    \n</a>\n\n</span>\n</div>\n            </div>\n        \n  </div></div>\n  <div class="sg-col-4-of-24 sg-col-4-of-12 sg-col-4-of-36 sg-col-4-of-28 sg-col-4-of-16 sg-col sg-col-4-of-20 sg-col-4-of-32"><div class="sg-col-inner">\n        \n        \n            <div class="a-section a-spacing-none a-spacing-top-small">\n                <div class="a-row a-size-base a-color-base"><div class="a-row">\n\n\n\n\n<a class="a-size-base a-link-normal s-no-hover a-text-normal" href="/gp/slredirect/picassoRedirect.html/ref=pa_sp_atf_aps_sr_pg1_1?ie=UTF8&amp;adId=A0414635KJLKI9OVJ7G1&amp;url=%2FBraun-Electric-Integrated-Precision-Rechargeable%2Fdp%2FB07F7XYMNN%2Fref%3Dsr_1_1_sspa%3Fkeywords%3Dshaver%26qid%3D1563148367%26s%3Dgateway%26sr%3D8-1-spons%26psc%3D1&amp;qualifier=1563148367&amp;id=6965017493400297&amp;widgetName=sp_atf">\n    \n        \n            \n                <span class="a-price" data-a-size="l" data-a-color="base"><span class="a-offscreen">$309.99</span><span aria-hidden="true"><span class="a-price-symbol">$</span><span class="a-price-whole">309<span class="a-price-decimal">.</span></span><span class="a-price-fraction">99</span></span></span>\n            \n                <span class="a-size-base a-color-secondary">($146.91/Pound)</span>\n            \n        \n        \n    \n</a>\n</div></div>\n            </div>\n        \n        \n            <div class="a-section a-spacing-none a-spacing-top-micro">\n                <div class="a-row a-size-base a-color-secondary s-align-children-center"><div class="a-row s-align-children-center">\n\n\n\n\n<span class="aok-inline-block s-image-logo-view">\n  <span class="aok-relative s-icon-text-medium s-prime">\n    <i class="a-icon a-icon-prime a-icon-medium" role="img" aria-label="Amazon Prime"></i>\n  </span>\n  <span>\n    \n  </span>\n</span>\n\n\n\n<span aria-label="Get it as soon as Thu, Jul 18">\n    <span>Get it as 
soon as </span><span class="a-text-bold">Thu, Jul 18</span>\n</span>\n</div><div class="a-row">\n\n\n<span aria-label="FREE Shipping by Amazon">\n    <span>FREE Shipping by Amazon</span>\n</span>\n</div></div>\n            </div>\n        \n        \n        \n        \n        \n  </div></div>\n  <div class="sg-col-4-of-24 sg-col-4-of-12 sg-col-4-of-36 sg-col-4-of-28 sg-col-4-of-16 sg-col sg-col-4-of-20 sg-col-4-of-32"><div class="sg-col-inner">\n        \n  </div></div>\n  <div class="sg-col-4-of-24 sg-col-4-of-12 sg-col-4-of-36 sg-col-4-of-28 sg-col-4-of-16 sg-col sg-col-4-of-20 sg-col-4-of-32"><div class="sg-col-inner">\n        \n        \n  </div></div>\n</div>\n</div>\n</div>\n\n</div>\n\n</div>\n\n</div></div>', '<div data-asin="B003YJAZZ4" data-index="1" class="sg-col-4-of-24 sg-col-4-of-12 sg-col-4-of-36 s-result-item sg-col-4-of-28 sg-col-4-of-16 AdHolder sg-col sg-col-4-of-20 sg-col-4-of-32"><div class="sg-col-inner">\n    \n\n\n<div data-component-type="s-impression-logger" data-component-props=\'{"percentageShownToFire":"50","batchable":true,"requiredElementSelector":".s-image","url":"https://www.amazon.com/gp/sponsored-products/logging/log-action.html?qualifier=1563148367&amp;id=6965017493400297&amp;widgetName=sp_atf&amp;adId=200005657625311&amp;eventType=1&amp;adIndex=1"}\' class="rush-component s-expand-height">\n    \n\n\n<div data-component-type="sp-sponsored-result" class="rush-component s-expand-height">\n    \n\n\n\n\n\n\n\n\n<div class="s-expand-height s-include-content-margin s-border-bottom">\n<div class="a-section a-spacing-medium">\n\n\n<div class="sg-row">\n  <div class="sg-col-4-of-24 sg-col-4-of-12 sg-col-4-of-36 sg-col-4-of-28 sg-col-4-of-16 sg-col sg-col-4-of-20 sg-col-4-of-32"><div class="sg-col-inner">\n        <div class="a-section a-spacing-micro s-min-height-extra-large">\n            \n        </div>\n    </div></div>\n</div>\n\n<div class="sg-row">\n  <div class="sg-col-4-of-24 sg-col-4-of-12 sg-col-4-of-36 sg-col-4-of-28 
sg-col-4-of-16 sg-col sg-col-4-of-20 sg-col-4-of-32"><div class="sg-col-inner">\n        \n        <div class="a-section a-spacing-none">\n            \n\n\n\n\n\n<span data-component-type="s-product-image" class="rush-component">\n    \n    <a class="a-link-normal" href="/gp/slredirect/picassoRedirect.html/ref=pa_sp_atf_aps_sr_pg1_2?ie=UTF8&amp;adId=A05912003D82ZR77VHH5H&amp;url=%2FBraun-Electric-Shaver-Station-Cordless%2Fdp%2FB003YJAZZ4%2Fref%3Dsr_1_2_sspa%3Fkeywords%3Dshaver%26qid%3D1563148367%26s%3Dgateway%26sr%3D8-2-spons%26psc%3D1&amp;qualifier=1563148367&amp;id=6965017493400297&amp;widgetName=sp_atf">\n        <div class="a-section aok-relative s-image-square-aspect">\n            \n                \n                \n                    <img src="https://m.media-amazon.com/images/I/814fSnCl0eL._AC_UL320_.jpg" class="s-image" alt="Braun Series 7 790cc-4 Electric Foil Shaver with Clean&amp;Charge Station, 1 Count" srcset="https://m.media-amazon.com/images/I/814fSnCl0eL._AC_UL320_.jpg 1x, https://m.media-amazon.com/images/I/814fSnCl0eL._AC_UL480_QL65_.jpg 1.5x, https://m.media-amazon.com/images/I/814fSnCl0eL._AC_UL640_QL65_.jpg 2x, https://m.media-amazon.com/images/I/814fSnCl0eL._AC_UL800_QL65_.jpg 2.5x, https://m.media-amazon.com/images/I/814fSnCl0eL._AC_UL960_QL65_.jpg 3x" data-image-index="1" data-image-load="" data-image-latency="s-product-image" data-image-source-density="1">\n                \n            \n        </div>\n    </a>\n</span>\n\n        </div>\n        \n  </div></div>\n  <div class="sg-col-4-of-24 sg-col-4-of-12 sg-col-4-of-36 sg-col-4-of-28 sg-col-4-of-16 sg-col sg-col-4-of-20 sg-col-4-of-32"><div class="sg-col-inner">\n        \n        <div class="a-section a-spacing-none a-spacing-top-small">\n            <div class="a-row a-spacing-micro"><span class="a-size-base a-color-secondary">Sponsored</span>\n\n\n\n\n\n<span class="a-declarative" data-action="a-popover" 
data-a-popover=\'{"dataStrategy":"preload","name":"sp-info-popover-B003YJAZZ4","position":"triggerVertical"}\'>\n    \n        \n        \n            \n            <span class="aok-inline-block s-info-icon"></span>\n        \n        \n    \n</span>\n\n\n\n    <div class="a-popover-preload" id="a-popover-sp-info-popover-B003YJAZZ4">\n        <span>These are ads for products you\'ll find on Amazon.com. </span><div class="a-row"><span>Clicking an ad will take you to the product\'s page.</span>\n\n\n\n\n<a class="a-link-normal" href="https://advertising.amazon.com/products-self-serve?ref_=ext_amzn_wtsp">\n    \n        \n            \n                <span>Learn more about Sponsored Products.</span>\n            \n        \n        \n    \n</a>\n</div><div class="a-row a-spacing-top-small"><span></span>\n\n\n\n\n\n<span class="a-declarative" data-action="a-modal" data-a-modal=\'{"dataStrategy":"ajax","header":"Share your feedback","url":"/gp/sponsored-products/lazyLoad/handler/sp-feedback-handler.html?pl=KHcYnXwozHREynoNEPM3bbMnH%2FMI8s5SVP9nZyO9%2Bemx9E7m9uWGEQS3nBvHj%2BTpHlWCgtJxv65B%0A1fbbCIAb1ivlHLGe9bDi9XBrklhROSKeoWM3PpYPqCIFGwSusBYKhsKTSsijnU0hcE2kwEkX6dPm%0AYWYhITYXNoUJuWjUPV%2F%2F6IRqxkNKhbsOVZQoki56dZeC6ojhq78vV%2FZUUtSmf8LwehZjTMvF65xc%0A6jxI8nbjFJMvluPsl7BEX7ZfF08o13Ip%2BIY8y8%2BwZMH5SFcUbkqfJtQPYfy3WMrC3fT4zOAu3z5J%0AJau%2BZWLYs7GHgJnQj%2Ftw4VVjQjWJXdund5ND1rLuRP%2B5UnCqff0wXM%2BYZWrAdJKeLpWSavRDwfM2%0AIaPiuHxMS4F%2F0Y05HmeJp%2FPwPWN8YmGMKMoA4egrr2HhF8Yi9dIQLgWLgd%2FNM521RQprNbGbROEU%0ANuWR3XcKd53kwCC2WSt6ZQ5C8SuEjfhdDUxKtA9E4E%2FoKwyvW4OdOVBn1gS8o3cxME8l4IYhL23o%0AW4Z6tYRWSPdXW%2BvflMnJVqE%3D"}\'>\n    \n        \n        \n        \n            \n\n\n\n\n<a class="a-link-normal" href="#">\n    \n        \n            \n                <span></span>\n            \n        \n        \n    \n</a>\n\n        \n    \n</span>\n\n\n\n</div>\n    </div>\n\n</div>\n\n\n\n\n<h2 class="a-size-mini a-spacing-none a-color-base s-line-clamp-4">\n    \n    \n        
\n\n\n\n\n<a class="a-link-normal a-text-normal" href="/gp/slredirect/picassoRedirect.html/ref=pa_sp_atf_aps_sr_pg1_2?ie=UTF8&amp;adId=A05912003D82ZR77VHH5H&amp;url=%2FBraun-Electric-Shaver-Station-Cordless%2Fdp%2FB003YJAZZ4%2Fref%3Dsr_1_2_sspa%3Fkeywords%3Dshaver%26qid%3D1563148367%26s%3Dgateway%26sr%3D8-2-spons%26psc%3D1&amp;qualifier=1563148367&amp;id=6965017493400297&amp;widgetName=sp_atf">\n    \n        \n            \n                <span class="a-size-base-plus a-color-base a-text-normal">Braun Series 7 790cc-4 Electric Foil Shaver with Clean&amp;Charge Station, 1 Count</span>\n            \n        \n        \n    \n</a>\n\n    \n</h2>\n\n        </div>\n        \n            <div class="a-section a-spacing-none a-spacing-top-micro">\n                <div class="a-row a-size-small">\n\n\n<span aria-label="4.3 out of 5 stars">\n    \n\n\n\n\n\n\n    \n        <span class="a-declarative" data-action="a-popover" data-a-popover=\'{"max-width":"700","closeButton":false,"position":"triggerBottom","url":"/review/widgets/average-customer-review/popover/ref=acr_search__popover?ie=UTF8&amp;asin=B003YJAZZ4&amp;ref=acr_search__popover&amp;contextId=search"}\'>\n            \n            <a href="javascript:void(0)" class="a-popover-trigger a-declarative"><i class="a-icon a-icon-star-small a-star-small-4-5 aok-align-bottom"><span class="a-icon-alt">4.3 out of 5 stars</span></i><i class="a-icon a-icon-popover"></i></a>\n        </span>\n    \n    \n\n\n</span>\n\n\n\n<span aria-label="7,761">\n    \n\n\n\n\n<a class="a-link-normal" href="/gp/slredirect/picassoRedirect.html/ref=pa_sp_atf_aps_sr_pg1_2?ie=UTF8&amp;adId=A05912003D82ZR77VHH5H&amp;url=%2FBraun-Electric-Shaver-Station-Cordless%2Fdp%2FB003YJAZZ4%2Fref%3Dsr_1_2_sspa%3Fkeywords%3Dshaver%26qid%3D1563148367%26s%3Dgateway%26sr%3D8-2-spons%26psc%3D1&amp;qualifier=1563148367&amp;id=6965017493400297&amp;widgetName=sp_atf#customerReviews">\n    \n        \n            \n                <span 
class="a-size-base">7,761</span>\n            \n        \n        \n    \n</a>\n\n</span>\n</div>\n            </div>\n        \n  </div></div>\n  <div class="sg-col-4-of-24 sg-col-4-of-12 sg-col-4-of-36 sg-col-4-of-28 sg-col-4-of-16 sg-col sg-col-4-of-20 sg-col-4-of-32"><div class="sg-col-inner">\n        \n        \n            <div class="a-section a-spacing-none a-spacing-top-small">\n                <div class="a-row a-size-base a-color-base"><div class="a-row">\n\n\n\n\n<a class="a-size-base a-link-normal s-no-hover a-text-normal" href="/gp/slredirect/picassoRedirect.html/ref=pa_sp_atf_aps_sr_pg1_2?ie=UTF8&amp;adId=A05912003D82ZR77VHH5H&amp;url=%2FBraun-Electric-Shaver-Station-Cordless%2Fdp%2FB003YJAZZ4%2Fref%3Dsr_1_2_sspa%3Fkeywords%3Dshaver%26qid%3D1563148367%26s%3Dgateway%26sr%3D8-2-spons%26psc%3D1&amp;qualifier=1563148367&amp;id=6965017493400297&amp;widgetName=sp_atf">\n    \n        \n            \n                <span class="a-price" data-a-size="l" data-a-color="base"><span class="a-offscreen">$199.94</span><span aria-hidden="true"><span class="a-price-symbol">$</span><span class="a-price-whole">199<span class="a-price-decimal">.</span></span><span class="a-price-fraction">94</span></span></span>\n            \n                <span class="a-price" data-a-size="b" data-a-strike="true" data-a-color="secondary"><span class="a-offscreen">$289.99</span><span aria-hidden="true"><span class="a-price-symbol">$</span><span class="a-price-whole">289<span class="a-price-decimal">.</span></span><span class="a-price-fraction">99</span></span></span>\n            \n        \n        \n    \n</a>\n</div></div><div class="a-row a-size-base a-color-secondary"><div class="a-row">\n\n\n\n\n\n<span data-component-type="s-coupon-component" data-component-props=\'{"asin":"B003YJAZZ4"}\' class="rush-component">\n    <span class="s-coupon-clipped aok-hidden">\n        <span class="a-color-base">$20.00 coupon applied.</span>\n    </span>\n    <span class="s-coupon-unclipped 
">\n        \n\n\n<span class="a-size-base s-coupon-highlight-color s-highlighted-text-padding aok-inline-block">\n    Save $20.00\n</span>\n\n        <span class="a-color-base"> with coupon</span>\n    </span>\n    \n</span>\n</div></div>\n            </div>\n        \n        \n            <div class="a-section a-spacing-none a-spacing-top-micro">\n                <div class="a-row a-size-base a-color-secondary s-align-children-center"><div class="a-row s-align-children-center">\n\n\n\n\n<span class="aok-inline-block s-image-logo-view">\n  <span class="aok-relative s-icon-text-medium s-prime">\n    <i class="a-icon a-icon-prime a-icon-medium" role="img" aria-label="Amazon Prime"></i>\n  </span>\n  <span>\n    \n  </span>\n</span>\n\n\n\n<span aria-label="Get it as soon as Thu, Jul 18">\n    <span>Get it as soon as </span><span class="a-text-bold">Thu, Jul 18</span>\n</span>\n</div><div class="a-row">\n\n\n<span aria-label="FREE Shipping by Amazon">\n    <span>FREE Shipping by Amazon</span>\n</span>\n</div></div>\n            </div>\n        \n        \n        \n        \n        \n  </div></div>\n  <div class="sg-col-4-of-24 sg-col-4-of-12 sg-col-4-of-36 sg-col-4-of-28 sg-col-4-of-16 sg-col sg-col-4-of-20 sg-col-4-of-32"><div class="sg-col-inner">\n        \n  </div></div>\n  <div class="sg-col-4-of-24 sg-col-4-of-12 sg-col-4-of-36 sg-col-4-of-28 sg-col-4-of-16 sg-col sg-col-4-of-20 sg-col-4-of-32"><div class="sg-col-inner">\n        \n        \n  </div></div>\n</div>\n</div>\n</div>\n\n</div>\n\n</div>\n\n</div></div>', '<div data-asin="B01M716CC2" data-index="19" class="sg-col-4-of-24 sg-col-4-of-12 sg-col-4-of-36 s-result-item sg-col-4-of-28 sg-col-4-of-16 AdHolder sg-col sg-col-4-of-20 sg-col-4-of-32"><div class="sg-col-inner">\n    \n\n\n<div data-component-type="s-impression-logger" 
data-component-props=\'{"percentageShownToFire":"50","batchable":true,"requiredElementSelector":".s-image","url":"https://www.amazon.com/gp/sponsored-products/logging/log-action.html?qualifier=1563148367&amp;id=6965017493400297&amp;widgetName=sp_mtf&amp;adId=200003291706211&amp;eventType=1&amp;adIndex=2"}\' class="rush-component s-expand-height">\n    \n\n\n<div data-component-type="sp-sponsored-result" class="rush-component s-expand-height">\n    \n\n\n\n\n\n\n\n\n<div class="s-expand-height s-include-content-margin s-border-bottom">\n<div class="a-section a-spacing-medium">\n\n\n<div class="sg-row">\n  <div class="sg-col-4-of-24 sg-col-4-of-12 sg-col-4-of-36 sg-col-4-of-28 sg-col-4-of-16 sg-col sg-col-4-of-20 sg-col-4-of-32"><div class="sg-col-inner">\n        <div class="a-section a-spacing-micro s-min-height-extra-large">\n            \n        </div>\n    </div></div>\n</div>\n\n<div class="sg-row">\n  <div class="sg-col-4-of-24 sg-col-4-of-12 sg-col-4-of-36 sg-col-4-of-28 sg-col-4-of-16 sg-col sg-col-4-of-20 sg-col-4-of-32"><div class="sg-col-inner">\n        \n        <div class="a-section a-spacing-none">\n            \n\n\n\n\n\n<span data-component-type="s-product-image" class="rush-component">\n    \n    <a class="a-link-normal" href="/gp/slredirect/picassoRedirect.html/ref=pa_sp_mtf_aps_sr_pg1_1?ie=UTF8&amp;adId=A0238051Q7V9JY6EMHYF&amp;url=%2FBraun-Electric-Shaver-9290cc-Travel%2Fdp%2FB01M716CC2%2Fref%3Dsr_1_19_sspa%3Fkeywords%3Dshaver%26qid%3D1563148367%26s%3Dgateway%26sr%3D8-19-spons%26psc%3D1&amp;qualifier=1563148367&amp;id=6965017493400297&amp;widgetName=sp_mtf">\n        <div class="a-section aok-relative s-image-square-aspect">\n            \n                \n                \n                    <img src="https://m.media-amazon.com/images/I/81CmsUO2IzL._AC_UL320_.jpg" class="s-image" alt="Braun Series 9 9290cc Electric Razor for Men, Rechargeable and Cordless Electric Shaver, Foil Shaver, Silver, with Clean&amp;Charge Station and Travel Case" 
srcset="https://m.media-amazon.com/images/I/81CmsUO2IzL._AC_UL320_.jpg 1x, https://m.media-amazon.com/images/I/81CmsUO2IzL._AC_UL480_QL65_.jpg 1.5x, https://m.media-amazon.com/images/I/81CmsUO2IzL._AC_UL640_QL65_.jpg 2x, https://m.media-amazon.com/images/I/81CmsUO2IzL._AC_UL800_QL65_.jpg 2.5x, https://m.media-amazon.com/images/I/81CmsUO2IzL._AC_UL960_QL65_.jpg 3x" data-image-index="19" data-image-load="" data-image-latency="s-product-image" data-image-source-density="1">\n                \n            \n        </div>\n    </a>\n</span>\n\n        </div>\n        \n  </div></div>\n  <div class="sg-col-4-of-24 sg-col-4-of-12 sg-col-4-of-36 sg-col-4-of-28 sg-col-4-of-16 sg-col sg-col-4-of-20 sg-col-4-of-32"><div class="sg-col-inner">\n        \n        <div class="a-section a-spacing-none a-spacing-top-small">\n            <div class="a-row a-spacing-micro"><span class="a-size-base a-color-secondary">Sponsored</span>\n\n\n\n\n\n<span class="a-declarative" data-action="a-popover" data-a-popover=\'{"dataStrategy":"preload","name":"sp-info-popover-B01M716CC2","position":"triggerVertical"}\'>\n    \n        \n        \n            \n            <span class="aok-inline-block s-info-icon"></span>\n        \n        \n    \n</span>\n\n\n\n    <div class="a-popover-preload" id="a-popover-sp-info-popover-B01M716CC2">\n        <span>These are ads for products you\'ll find on Amazon.com. 
</span><div class="a-row"><span>Clicking an ad will take you to the product\'s page.</span>\n\n\n\n\n<a class="a-link-normal" href="https://advertising.amazon.com/products-self-serve?ref_=ext_amzn_wtsp">\n    \n        \n            \n                <span>Learn more about Sponsored Products.</span>\n            \n        \n        \n    \n</a>\n</div><div class="a-row a-spacing-top-small"><span></span>\n\n\n\n\n\n<span class="a-declarative" data-action="a-modal" data-a-modal=\'{"dataStrategy":"ajax","header":"Share your feedback","url":"/gp/sponsored-products/lazyLoad/handler/sp-feedback-handler.html?pl=0p%2BGEEYfSi3REr7K6Ac9Nqe9JjlS1kupaJ7MTBFbv7xen2TPg5Oc%2BN2Ae163fnU01qKZeDgPtGas%0A0Feh9ykvFdqBAZMnMfCHs4k%2Beht6HW%2FzanzhIRMebuDSFcmpXsxMIlEyihB6RIC2LnQNUhfy8i3x%0AWhSijAPnRknldtvl%2BiXb%2FomJhnuVZuVr0qxvvFXe4cjEudG1ABX946GnadxoboiHHfy9GwF6QF1b%0ATdmSjB%2BE7yy3HB3B6E9ImtbgoqBIk4aSkqRyXuahRoAp1brZO3Nn3qFPYXDIG2%2F%2BCDzJndYLL%2FCK%0AVvZ3R6lN42KA6oTI4CxoMs%2FmfiN7P85KWyTeS8YX6ICcjkaIjnvxjOCDp%2FX8%2FDrKYNWc8GrdY4Fb%0ABQSlas58beh5VfDSQ0Tiwe3TkLoIXzEGFfsIPEa2OP0AyJWut4dsSB%2FQ%2FHzx71c27lH0R6cIGGdV%0APA0iOLrWXAYZpM6VBSEcKJb4Zu%2F0bnzB%2Be6s9yF%2FduaUtfqsiXgXyypf3TA1wNADbj0mPJp1Fj5W%0AEdvQxCk8Z%2B6fIpGfW%2Bh6d%2BI4PP3QDZF6yjN5%2FFd1jf8O%2BXp5AefGs38%3D"}\'>\n    \n        \n        \n        \n            \n\n\n\n\n<a class="a-link-normal" href="#">\n    \n        \n            \n                <span></span>\n            \n        \n        \n    \n</a>\n\n        \n    \n</span>\n\n\n\n</div>\n    </div>\n\n</div>\n\n\n\n\n<h2 class="a-size-mini a-spacing-none a-color-base s-line-clamp-4">\n    \n    \n        \n\n\n\n\n<a class="a-link-normal a-text-normal" 
href="/gp/slredirect/picassoRedirect.html/ref=pa_sp_mtf_aps_sr_pg1_1?ie=UTF8&amp;adId=A0238051Q7V9JY6EMHYF&amp;url=%2FBraun-Electric-Shaver-9290cc-Travel%2Fdp%2FB01M716CC2%2Fref%3Dsr_1_19_sspa%3Fkeywords%3Dshaver%26qid%3D1563148367%26s%3Dgateway%26sr%3D8-19-spons%26psc%3D1&amp;qualifier=1563148367&amp;id=6965017493400297&amp;widgetName=sp_mtf">\n    \n        \n            \n                <span class="a-size-base-plus a-color-base a-text-normal">Braun Series 9 9290cc Electric Razor for Men, Rechargeable and Cordless Electric Shaver, Foil Shaver, Silver, with Clean&amp;Charge Station and Travel Case</span>\n            \n        \n        \n    \n</a>\n\n    \n</h2>\n\n        </div>\n        \n            <div class="a-section a-spacing-none a-spacing-top-micro">\n                <div class="a-row a-size-small">\n\n\n<span aria-label="3.9 out of 5 stars">\n    \n\n\n\n\n\n\n    \n        <span class="a-declarative" data-action="a-popover" data-a-popover=\'{"max-width":"700","closeButton":false,"position":"triggerBottom","url":"/review/widgets/average-customer-review/popover/ref=acr_search__popover?ie=UTF8&amp;asin=B01M716CC2&amp;ref=acr_search__popover&amp;contextId=search"}\'>\n            \n            <a href="javascript:void(0)" class="a-popover-trigger a-declarative"><i class="a-icon a-icon-star-small a-star-small-4 aok-align-bottom"><span class="a-icon-alt">3.9 out of 5 stars</span></i><i class="a-icon a-icon-popover"></i></a>\n        </span>\n    \n    \n\n\n</span>\n\n\n\n<span aria-label="1,079">\n    \n\n\n\n\n<a class="a-link-normal" href="/gp/slredirect/picassoRedirect.html/ref=pa_sp_mtf_aps_sr_pg1_1?ie=UTF8&amp;adId=A0238051Q7V9JY6EMHYF&amp;url=%2FBraun-Electric-Shaver-9290cc-Travel%2Fdp%2FB01M716CC2%2Fref%3Dsr_1_19_sspa%3Fkeywords%3Dshaver%26qid%3D1563148367%26s%3Dgateway%26sr%3D8-19-spons%26psc%3D1&amp;qualifier=1563148367&amp;id=6965017493400297&amp;widgetName=sp_mtf#customerReviews">\n    \n        \n            \n                <span 
class="a-size-base">1,079</span>\n            \n        \n        \n    \n</a>\n\n</span>\n</div>\n            </div>\n        \n  </div></div>\n  <div class="sg-col-4-of-24 sg-col-4-of-12 sg-col-4-of-36 sg-col-4-of-28 sg-col-4-of-16 sg-col sg-col-4-of-20 sg-col-4-of-32"><div class="sg-col-inner">\n        \n        \n            <div class="a-section a-spacing-none a-spacing-top-small">\n                <div class="a-row a-size-base a-color-base"><div class="a-row">\n\n\n\n\n<a class="a-size-base a-link-normal s-no-hover a-text-normal" href="/gp/slredirect/picassoRedirect.html/ref=pa_sp_mtf_aps_sr_pg1_1?ie=UTF8&amp;adId=A0238051Q7V9JY6EMHYF&amp;url=%2FBraun-Electric-Shaver-9290cc-Travel%2Fdp%2FB01M716CC2%2Fref%3Dsr_1_19_sspa%3Fkeywords%3Dshaver%26qid%3D1563148367%26s%3Dgateway%26sr%3D8-19-spons%26psc%3D1&amp;

To expand on this, you might need to do something like the following:

for result in response.css('.s-result-list div'):
    if result.css('.AdHolder').extract_first():
        item['adholder'] = True
    else:
        item['adholder'] = False
    # ... rest of item logic

I am not entirely happy with this approach, but I think it will work.
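
A variation that may be worth trying, as a minimal sketch and not a verified fix: in the output above, AdHolder is one of the class names on the same div that carries s-result-item, data-asin and data-index. The flag could therefore be derived from each result's own class attribute while still looping over '.s-result-item'. The helper name has_class is hypothetical and not part of Scrapy.

```python
from typing import Optional


def has_class(class_attr: Optional[str], name: str) -> bool:
    """Return True if `name` is one of the whitespace-separated class tokens."""
    if not class_attr:
        return False
    return name in class_attr.split()


# Class string copied from the first result in the output above.
sample = ("sg-col-4-of-24 sg-col-4-of-12 sg-col-4-of-36 s-result-item "
          "sg-col-4-of-28 sg-col-4-of-16 AdHolder sg-col sg-col-4-of-20 "
          "sg-col-4-of-32")

sponsored = "Yes" if has_class(sample, "AdHolder") else "No"
print(sponsored)  # prints "Yes"
```

Inside the spider, the class string could come from product.xpath('@class').get() for each product in response.css('.s-result-item'). Whether the HTML that Scrapy actually receives contains the sponsored results at all is a separate question that this sketch does not answer.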