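I need to keep the empty values from an XPath query that returns links as a list. Some of the matched entries have no link to reference, and those empty results are simply missing when I print the list.
The goal is to get the link that corresponds to each degree.
My code is: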
postgraduatedegrees = tree.xpath('//*[@id="block-scholarly-content"]/div/article/div/div/div//*[contains(text(),"Degree of")]/text()')
postgraduatedegreeslinks = tree.xpath('//*[@id="block-scholarly-content"]/div/article/div/div/div//*[contains(text(),"Degree of")]/@href')
Output:
len(postgraduatedegrees)
Out[222]: 52
len(postgraduatedegreeslinks)
Out[223]: 40
The empty values are dropped, so the list of links is shorter than the list of degrees. Please help me solve this.
Answer 0 (score: 1)
The solution is:
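import requests
from lxml import html

url = "the url of the web page"
page = requests.get(url, headers={'User-Agent': 'Mozilla/5.0'})
tree = html.fromstring(page.content)
# Select the matching elements themselves instead of their text or href.
postgraduate = tree.xpath('//*[@id="block-scholarly-content"]/div/article/div/div/div//*[contains(text(),"Degree of")]')
for pg in postgraduate:
    pgcourse = pg.xpath('.//text()')  # text nodes inside this element
    pglink = pg.xpath('.//@href')     # empty list when the element has no link
    print(pgcourse, pglink)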
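The for loop also iterates over the elements that have no link. For those, pglink is an empty list, so every degree still gets an entry.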
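To make the pairing explicit, here is a minimal sketch of the same idea that runs without a network request. The SAMPLE markup, the shortened XPath, and the helper name degrees_with_links are my own assumptions for illustration and are not taken from the real page; the point is only that querying each element separately lets you substitute None where no href exists.

```python
from lxml import html

# Hypothetical markup standing in for the real page (an assumption).
SAMPLE = """
<html><body>
<div id="block-scholarly-content">
  <p><a href="/degrees/arts">Degree of Master of Arts</a></p>
  <p><span>Degree of Master of Science</span></p>
  <p><a href="/degrees/laws">Degree of Master of Laws</a></p>
</div>
</body></html>
"""


def degrees_with_links(tree):
    """Return (title, link) pairs; link is None when the element has no href."""
    # Shortened XPath for the sample; the real page needs the full path.
    nodes = tree.xpath(
        '//*[@id="block-scholarly-content"]//*[contains(text(), "Degree of")]'
    )
    pairs = []
    for node in nodes:
        # Collapse whitespace in the element's text.
        title = " ".join(node.text_content().split())
        # Relative query: only hrefs on this element or its descendants.
        hrefs = node.xpath('.//@href')
        pairs.append((title, hrefs[0] if hrefs else None))
    return pairs


pairs = degrees_with_links(html.fromstring(SAMPLE))
for title, link in pairs:
    print(title, link)
```

Because each element contributes exactly one pair, the result has the same length as the list of degrees, and the entries without a link carry None rather than disappearing.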