Unable to get past list index error

Time: 2017-05-25 21:02:30

Tags: python-3.x web-scraping

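I have written a script to scrape names and prices from craigslist. It runs smoothly until it reaches a row where one of the values is missing (None). As soon as it hits such a value, it breaks with "list index out of range". How can I handle this?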

import requests
from lxml import html

page = requests.get('http://bangalore.craigslist.co.in/search/rea?s=120').text
tree = html.fromstring(page)
rows = tree.xpath('//li[@class="result-row"]')
for row in rows:
    link = row.xpath('.//a[contains(@class,"hdrlnk")]/text()')[0]
    price = row.xpath('.//span[@class="result-price"]/text()')[0]
    print(link, price)

1 answer:

Answer 0 (score: 0)

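This is the most effective technique I have come across so far for avoiding the error: check whether the XPath query returned anything before indexing into the result.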

import requests
from lxml import html

page = requests.get('http://bangalore.craigslist.co.in/search/rea?s=120').text
tree = html.fromstring(page)

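def if_exist(row, xpath):
    # xpath() returns a list; it is empty when nothing matches
    docs = row.xpath(xpath)
    if docs:
        return docs[0]
    # Fall back to an empty string instead of raising IndexError
    return ""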

for row in tree.xpath('//li[@class="result-row"]'):
    link = if_exist(row,'.//a[contains(@class,"hdrlnk")]/text()')
    price = if_exist(row,'.//span[@class="result-price"]/text()')
    print(link, price)
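
The error comes from indexing with [0] into the empty list that xpath() returns when a row has no matching element. The same guard can be written more compactly with the built-in next() and a default value. The following is only a sketch: the helper name first_or_default and the sample lists are hypothetical, and the plain lists stand in for what row.xpath(...) would return so that the snippet runs without a network connection or lxml.

```python
def first_or_default(results, default=""):
    """Return the first item of `results`, or `default` if it is empty."""
    # next() on an exhausted iterator returns the default instead of raising
    return next(iter(results), default)


# Hypothetical stand-ins for the lists returned by row.xpath(...)
with_price = ["$1200"]
without_price = []

print(first_or_default(with_price))
print(first_or_default(without_price, default="N/A"))
```

In the scraping loop this would be used as price = first_or_default(row.xpath('.//span[@class="result-price"]/text()')), which keeps the default value configurable per field.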