Question

I'm doing web scraping. I need to extract one data field (typo), and there are 3 possible cases:

try:
    # 1st case
    typo = int(response.xpath('//td[contains(text(),"Chambres")]/following-sibling::td[@class="right"]/text()').extract()[0])
except:
    # 2nd case, when the 1st case gives an IndexError
    typo = int(sel1.xpath('//td[contains(text(),"Pièces (nombre total)")]/following-sibling::td[@class="right"]/text()').extract()[0])
except IndexError:
    # 3rd case, when the first and second cases give an IndexError
    typo = 0

I get an error when I run it ("must be last").

Answer 1

You want to nest the try statements:

try:
    x = response.xpath('//td[contains(text(),"Chambres")]/following-sibling::td[@class="right"]/text()')
    typo = int(x.extract()[0])
except IndexError:
    try:
        x = sel1.xpath('//td[contains(text(),"Pièces (nombre total)")]/following-sibling::td[@class="right"]/text()')
        typo = int(x.extract()[0])
    except IndexError:
        typo = 0
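
To see why nesting is needed, here is a minimal sketch that uses plain lists in place of the XPath results. The function names flat and nested are hypothetical, chosen only for this illustration. The except clauses of a single try statement only guard the try body, so an exception raised inside one handler is not caught by a sibling clause.

```python
def flat(first, second):
    """One try with sibling handlers: the fallback lookup is unprotected."""
    try:
        return int(first[0])
    except IndexError:
        # An IndexError raised here escapes; the clause below does not see it.
        return int(second[0])
    except LookupError:
        return 0


def nested(first, second):
    """Nested try statements: each lookup has its own handler."""
    try:
        return int(first[0])
    except IndexError:
        try:
            return int(second[0])
        except IndexError:
            return 0


print(nested([], []))     # falls through both lookups to the default
print(flat([], ["5"]))    # second lookup succeeds inside the handler
# flat([], []) would raise IndexError instead of returning 0
```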

You can simplify this a bit with a loop:

attempts = [
    (response.xpath, '//td...'),
    (sel1.xpath, '//td...'),
]
typo = 0
for f, arg in attempts:
    try:
        typo = int(f(arg).extract()[0])
    except IndexError:
        continue  # no match, try the next expression
    break  # first successful parse wins, as in the nested version

typo is initialized to the fallback value, but is overwritten if any of the attempts parses successfully.
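
The same idea can be wrapped in a small helper. This is only a sketch: first_int is a hypothetical name, and the lambdas returning plain lists stand in for calls such as response.xpath(...).extract(). It handles only IndexError, as in the code above; a non-numeric string would still raise ValueError from int().

```python
def first_int(attempts, default=0):
    """Return int() of the first element from the first attempt that has one.

    `attempts` is an iterable of zero-argument callables, each returning a
    list of strings (an assumption made for this sketch).
    """
    for get_values in attempts:
        try:
            return int(get_values()[0])
        except IndexError:
            continue  # empty result, move on to the next attempt
    return default


attempts = [
    lambda: [],       # stands in for an XPath query with no match
    lambda: ["4"],    # first query that yields a value
    lambda: ["7"],    # never evaluated, because the previous one succeeded
]
typo = first_int(attempts)
print(typo)
```

Returning from inside the loop stops at the first success, so later attempts are never run.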
