I am building this spider in Scrapy with Python 3. The problem is that whenever I use rules, it reports an "invalid syntax" error at def parse_productPage. When I remove the rules it doesn't complain and works fine. I can't find what is wrong with the code. Can you help me? Here is the code:
import scrapy
from quo.items import QuoItem
from scrapy.linkextractors import LinkExtractor
from scrapy.spiders import CrawlSpider, Rule
class ISpider(CrawlSpider):
    name = 'iShopE'
    allowed_domains = ['ishopping.pk']
    start_urls = ['https://www.ishopping.pk/electronics/home-theatres.html']

    rules = (
        Rule(LinkExtractor(restrict_xpaths=('//div["category-products-"]'), follow=True),
        Rule(LinkExtractor(restrict_xpaths=('//h2[@class="product-name"]/a/@href'), callback='parse_productPage'),
    )

    def parse_productPage(self, response):
        for rev in response.xpath('//div["product-essential"]'):
            item = QuoItem()
            price = response.xpath('//div[@class="price-box"]/span[@class="regular-price"]/meta[@itemprop="price"]/@content').extract()
            if price:
                item['price'] = price
            Availability = response.xpath('//p[@class="availability in-stock"]/span[@class="value"]/text()').extract()
            if Availability:
                item['Availability'] = Availability
            Brand = response.xpath('(//div[@class="box-p-attr"]/span)[1]/text()').extract()
            if Brand:
                item['Brand'] = Brand
            deliveryTime = response.xpath('(//div[@class="box-p-attr"]/span)[2]/text()').extract()
            if deliveryTime:
                item['deliveryTime'] = deliveryTime
            Waranty = response.xpath('(//div[@class="box-p-attr"]/span)[3]/text()').extract()
            if Waranty:
                item['Waranty'] = Waranty
            yield item
Here is the output log: Output log
Answer (score: 0)
Contrary to what the error message suggests, the problem is actually in the lines just before it:
rules = (
    Rule(LinkExtractor(restrict_xpaths=('//div["category-products-"]'), follow=True),
    Rule(LinkExtractor(restrict_xpaths=('//h2[@class="product-name"]/a/@href'), callback='parse_productPage'),
)
If you count the parentheses carefully, you can see that each Rule has three opening parentheses but only two closing ones:
Rule(
    LinkExtractor(
        restrict_xpaths=('//div["category-products-"]'),
        follow=True
    )
So to fix the problem, just add one closing parenthesis to each Rule:
rules = (
    Rule(LinkExtractor(restrict_xpaths=('//div["category-products-"]'), follow=True)),
    Rule(LinkExtractor(restrict_xpaths=('//h2[@class="product-name"]/a/@href'), callback='parse_productPage')),
)
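Note that even with the parentheses balanced, the crawl may still fail at runtime: follow is a Rule argument, not a LinkExtractor one, and restrict_xpaths normally points at elements rather than at @href attributes. Below is a minimal sketch of how the spider head could look with those two details adjusted; the XPath expressions are kept from the question and may still need tuning for the actual page markup, so treat it as an illustration rather than a verified fix.

import scrapy
from scrapy.linkextractors import LinkExtractor
from scrapy.spiders import CrawlSpider, Rule


class ISpider(CrawlSpider):
    name = 'iShopE'
    allowed_domains = ['ishopping.pk']
    start_urls = ['https://www.ishopping.pk/electronics/home-theatres.html']

    rules = (
        # Category/listing pages: follow them but do not parse items.
        # follow=True belongs to Rule, not to LinkExtractor.
        # The original predicate ["category-products-"] matches every <div>;
        # something like contains(@class, "category-products") may be intended.
        Rule(LinkExtractor(restrict_xpaths='//div["category-products-"]'),
             follow=True),
        # Product links: restrict to the <a> element, not its @href attribute,
        # and hand each product page to parse_productPage.
        Rule(LinkExtractor(restrict_xpaths='//h2[@class="product-name"]/a'),
             callback='parse_productPage'),
    )

    def parse_productPage(self, response):
        # The item-building code from the question goes here, unchanged.
        ...

These adjustments are separate from the accepted fix; the original SyntaxError comes solely from the unbalanced parentheses.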