I'm doing web scraping. I need to extract one data field (the typology), and there are 3 possible cases:
    try:
        # 1st case
        typo = int(response.xpath('//td[contains(text(),"Chambres")]/following-sibling::td[@class="right"]/text()').extract()[0])
    except:
        # 2nd case, when the 1st case gives an IndexError
        typo = int(sel1.xpath('//td[contains(text(),"Pièces (nombre total)")]/following-sibling::td[@class="right"]/text()').extract()[0])
    except IndexError:
        # 3rd case, when the first and second cases give an IndexError
        typo = 0
I get an error when I run it (default 'except:' must be last).
Answer 0 (score: 2)
You want to nest the try statements:
    try:
        x = response.xpath('//td[contains(text(),"Chambres")]/following-sibling::td[@class="right"]/text()')
        typo = int(x.extract()[0])
    except IndexError:
        try:
            x = sel1.xpath('//td[contains(text(),"Pièces (nombre total)")]/following-sibling::td[@class="right"]/text()')
            typo = int(x.extract()[0])
        except IndexError:
            typo = 0
You can simplify this a bit with a loop:
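    attempts = [
        (response.xpath, '//td...'),
        (sel1.xpath, '//td...'),
    ]
    typo = 0
    for f, arg in attempts:
        try:
            typo = int(f(arg).extract()[0])
        except IndexError:
            # No match for this selector; try the next one.
            continue
        # Stop at the first success so a later attempt cannot overwrite it.
        break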
typo is initialized to the fallback value, and is overwritten if one of the attempts parses successfully.
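If the same fallback pattern is needed for several fields, the loop could be wrapped in a small function. The following is only a sketch: `first_int` is a hypothetical helper name (not part of Scrapy), and plain lists stand in for what `.extract()` would return, so it runs without a crawler.

```python
def first_int(extractors, default=0):
    """Return int() of the first item from the first extractor that yields one.

    `extractors` is a sequence of zero-argument callables, each returning a
    list of strings (as `.extract()` does). Hypothetical helper for
    illustration only.
    """
    for extract in extractors:
        try:
            # Returning here stops at the first successful attempt.
            return int(extract()[0])
        except IndexError:
            # Empty result: this selector matched nothing, try the next one.
            continue
    # Every attempt came back empty.
    return default


# Plain lists stand in for the results of `.xpath(...).extract()`.
print(first_int([lambda: [], lambda: ["4"]]))     # 4: second source is used
print(first_int([lambda: ["3"], lambda: ["4"]]))  # 3: first source wins
print(first_int([lambda: [], lambda: []]))        # 0: falls back to the default
```

In the scraper, each callable would presumably be something like `lambda: response.xpath('//td...').extract()`. As in the answer's code, only IndexError is handled, so text that is present but not numeric would still raise ValueError from int().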