I am trying to print only non-null values, but I'm not sure why empty values still show up in the output:
Input:
from lxml import html
import requests
import linecache
i = 1
read_url = linecache.getline('stocks_url', 1)
while read_url != '':
    page = requests.get(read_url)
    tree = html.fromstring(page.text)
    percentage = tree.xpath('//span[@class="grnb_20"]/text()')
    if percentage != None:
        print percentage
    i = i + 1
    read_url = linecache.getline('stocks_url', i)
Output:
$ python test_null.py
['76%']
['76%']
['80%']
['92%']
['77%']
['71%']
[]
['50%']
[]
['100%']
['67%']
Answer 0 (score: 0)
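You are getting empty lists, not None objects. You are testing for the wrong thing here; you see [], whereas if a Python null had been returned you would see None. For an expression that selects nodes or text, like this one, the Element.xpath() method always returns a list object, and that list can be empty.
Use a boolean test: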
percentage = tree.xpath('//span[@class="grnb_20"]/text()')
if percentage:
    print percentage[0]
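Empty lists (and None) test as false in a boolean context. I chose to print the first element of the XPath result, since you only ever seem to get one element.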
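To make the difference between the two tests concrete, here is a minimal sketch that uses plain Python values in place of real XPath results. The `results` list is made up for illustration; it only mimics the kinds of values discussed above.

```python
# Made-up stand-ins for what tree.xpath(...) might return on different pages,
# plus a None to show how a real null would behave.
results = [['76%'], [], ['50%'], None]

# Comparing against None lets empty lists through, because [] is not None.
passed_none_check = [r for r in results if r != None]

# A truthiness test rejects both [] and None.
passed_bool_check = [r for r in results if r]

print(passed_none_check)
print(passed_bool_check)
```

Only the second filter drops the empty lists, which is why the `!= None` comparison in the question still printed [].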
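Note that linecache is primarily aimed at caching Python source files; it is used to display tracebacks when an error occurs, and when you use inspect.getsource(). It isn't really meant for reading a file. You can just use open() and loop over the file, without having to keep incrementing a counter: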
with open('stocks_url') as urlfile:
    for url in urlfile:
        # Each line read from the file ends with a newline; strip it before the request.
        page = requests.get(url.strip())
        tree = html.fromstring(page.content)
        percentage = tree.xpath('//span[@class="grnb_20"]/text()')
        if percentage:
            print percentage[0]
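If the file might contain blank lines or stray whitespace, one option is a small generator that cleans each line before it reaches requests.get(). This is only a sketch under assumptions: the helper name `iter_urls` is hypothetical, and the temporary file with example.com URLs merely stands in for stocks_url.

```python
import os
import tempfile


def iter_urls(path):
    """Yield each non-blank line of the file at `path`, stripped of whitespace."""
    with open(path) as urlfile:
        for line in urlfile:
            url = line.strip()  # drop the trailing newline and stray spaces
            if url:             # skip blank lines entirely
                yield url


# Demo file with made-up URLs, standing in for the real stocks_url file.
with tempfile.NamedTemporaryFile('w', suffix='.txt', delete=False) as demo:
    demo.write('http://example.com/a\n\nhttp://example.com/b\n')

urls = list(iter_urls(demo.name))
os.remove(demo.name)
print(urls)
```

The loop in the answer could then read `for url in iter_urls('stocks_url'):` and pass `url` straight to requests.get().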