Empty XPath results are not printed

Time: 2018-09-03 10:26:04

Tags: python xml href

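I also need to get the empty values from an XPath query that returns links as a list of results. Some degrees have no link to reference, and when I print the results the corresponding empty entries are not printed.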

The goal is to get the link that belongs to each degree.

My code is:

  postgraduatedegrees = tree.xpath('//*[@id="block-scholarly-content"]/div/article/div/div/div//*[contains(text(),"Degree of")]/text()')

  postgraduatedegreeslinks = tree.xpath('//*[@id="block-scholarly-content"]/div/article/div/div/div//*[contains(text(),"Degree of")]/@href')

  Output:

  len(postgraduatedegrees)
  Out[222]: 52

  len(postgraduatedegreeslinks)
  Out[223]: 40

The empty values are dropped, so the two lists no longer line up. Please help me solve this.

1 answer:

Answer 0: (score: 1)

The solution is to select the matching elements first and then query each one separately:

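import requests
from lxml import html

url = "the url of the web page"  # placeholder: replace with the real page URL
page = requests.get(url, headers={'User-Agent': 'Mozilla/5.0'})
tree = html.fromstring(page.content)
# Select the matching elements themselves, not their text or href values
postgraduate = tree.xpath('//*[@id="block-scholarly-content"]/div/article/div/div/div//*[contains(text(),"Degree of")]')
for pg in postgraduate:
    pgcourse = pg.xpath('.//text()')  # text nodes of this element
    pglink = pg.xpath('.//@href')     # empty list when the element has no link
    print(pgcourse, pglink)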

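The for loop also iterates over the elements that have no link. For those, `pglink` is simply an empty list, so every degree keeps its position and stays paired with its own result.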
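
To make the difference concrete, here is a minimal sketch that runs without a network request. The `SAMPLE` markup, its paths, and the `degree_rows` helper are made up for illustration and are not taken from the real page; the XPath is shortened to fit the sample. It shows the two flat queries returning lists of different lengths, and the per-element loop returning one row per degree with `None` where no link exists.

```python
from lxml import html

# Hypothetical sample markup (an assumption, not the real page):
# the second degree has no link.
SAMPLE = """
<div id="block-scholarly-content">
  <p><a href="/degrees/arts">Degree of Master of Arts</a></p>
  <p><span>Degree of Master of Science</span></p>
  <p><a href="/degrees/laws">Degree of Master of Laws</a></p>
</div>
"""

MATCH = '//*[@id="block-scholarly-content"]//*[contains(text(), "Degree of")]'

tree = html.fromstring(SAMPLE)

# Flat queries: an element without an href contributes nothing to `links`,
# so the two lists cannot be paired up by position.
names = tree.xpath(MATCH + '/text()')
links = tree.xpath(MATCH + '/@href')


def degree_rows(tree):
    """Return one (name, link) tuple per matching element; link may be None."""
    rows = []
    for node in tree.xpath(MATCH):
        name = "".join(node.xpath('.//text()')).strip()
        hrefs = node.xpath('.//@href')
        rows.append((name, hrefs[0] if hrefs else None))
    return rows


rows = degree_rows(tree)
for name, link in rows:
    print(name, link)
```

Taking `hrefs[0] if hrefs else None` is just one way to represent a missing link; an empty string or the empty list itself would work equally well, as long as every element produces exactly one row.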