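I've written a script to scrape names and prices from craigslist. It runs fine until it reaches a row where one of the values is missing. As soon as it hits such a row, it breaks with "list index out of range". How can I handle this?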
import requests
from lxml import html
page = requests.get('http://bangalore.craigslist.co.in/search/rea?s=120').text
tree = html.fromstring(page)
rows = tree.xpath('//li[@class="result-row"]')
for row in rows:
    link = row.xpath('.//a[contains(@class,"hdrlnk")]/text()')[0]
    price = row.xpath('.//span[@class="result-price"]/text()')[0]
    print(link, price)
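Answer 0 (score: 0)
This is the most effective technique I've come across so far for avoiding the error. xpath() returns a list, and that list is empty when nothing matches, so check it before taking element [0]: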
import requests
from lxml import html
page = requests.get('http://bangalore.craigslist.co.in/search/rea?s=120').text
tree = html.fromstring(page)
def if_exist(row, xpath):
    # xpath() returns a list; it is empty when the row has no match.
    docs = row.xpath(xpath)
    if docs:
        return docs[0]
    # Fall back to an empty string instead of raising IndexError.
    return ""

for row in tree.xpath('//li[@class="result-row"]'):
    link = if_exist(row, './/a[contains(@class,"hdrlnk")]/text()')
    price = if_exist(row, './/span[@class="result-price"]/text()')
    print(link, price)
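The same "first match or a default" idea can also be written without an explicit if, using next() with a default value. This is only a minimal sketch: the helper name first and the sample rows are made up for illustration, and the plain lists stand in for what row.xpath(...) would return, so it runs without a network request.

```python
def first(results, default=""):
    """Return the first item of results, or default if results is empty."""
    # next() takes the first item from the iterator, or falls back to
    # default instead of raising when there is nothing to take.
    return next(iter(results), default)


# Hypothetical stand-ins for xpath results: a list of matches, possibly empty.
sample_rows = [
    {"title": ["2 BHK flat"], "price": ["15000"]},
    {"title": ["Plot for sale"], "price": []},  # this row has no price
]

for row in sample_rows:
    print(first(row["title"]), first(row["price"], default="N/A"))
```

In the scraper above, the call would look like first(row.xpath('.//span[@class="result-price"]/text()'), default="N/A"), which makes rows without a price easy to spot in the output.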