I need to parse some child elements inside every parent element on a page.
First, I create a list of all the articles on the page:
article_elements = driver.find_elements_by_tag_name('article')
Then I try to get the child elements in a for loop and append all the results to a list:
for article in article_elements:
    title = article.find_element_by_xpath('//article/h2').text
    share_count = article.find_element_by_xpath('//footer/div/a/span').text
    points = article.find_element_by_xpath('//footer/div[2]/div[1]/div[3]').text
    meta_info_list.append({'title':title, 'share count':share_count, 'points':points})
After the loop finishes, I get the metadata of the same article (the first one) 40 times:
{'share count': u'66', 'points': u'53 points', 'title': u'25+ Random Acts Of Genius Vandalism'}
{'share count': u'66', 'points': u'53 points', 'title': u'25+ Random Acts Of Genius Vandalism'}
{'share count': u'66', 'points': u'53 points', 'title': u'25+ Random Acts Of Genius Vandalism'}
{'share count': u'66', 'points': u'53 points', 'title': u'25+ Random Acts Of Genius Vandalism'}
... 40 times
My full code:
# coding: utf8
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
import time
driver = webdriver.Chrome()
driver.set_window_size(1024,768)
driver.get('http://www.boredpanda.com/')
time.sleep(2)
meta_info_list = []
article_elements = driver.find_elements_by_tag_name('article')
for article in article_elements:
    title = article.find_element_by_xpath('//article/h2').text
    share_count = article.find_element_by_xpath('//footer/div/a/span').text
    points = article.find_element_by_xpath('//footer/div[2]/div[1]/div[3]').text
    meta_info_list.append({'title':title, 'share count':share_count, 'points':points})
for meta_info in meta_info_list:
    print(meta_info)
Answer 0 (score: 4)
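The XPath expressions inside the loop must start with a dot to be context-specific. An expression that starts with // is evaluated from the document root, so every iteration returns the first match on the page; a leading dot makes the search start at the current article element:
for article in article_elements:
    # h2 is a direct child of the current article
    title = article.find_element_by_xpath('./h2').text
    share_count = article.find_element_by_xpath('.//footer/div/a/span').text
    points = article.find_element_by_xpath('.//footer/div[2]/div[1]/div[3]').text
    meta_info_list.append({'title':title, 'share count':share_count, 'points':points})
As a side note, the loop can be shortened with a list comprehension.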
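The list comprehension mentioned in the side note might look like the sketch below. It assumes the Selenium 3 style `find_element_by_xpath` method used in the question (newer Selenium releases removed it in favor of `find_element(By.XPATH, ...)`), and that the `h2` is a direct child of each `article`, as the question's original XPath suggests. `collect_meta` is a hypothetical helper name for illustration, not part of Selenium.

```python
def collect_meta(article_elements):
    """Return one dict per article, using XPath relative to each element.

    Every expression starts with a dot, so the lookup begins at the
    current article instead of at the document root.
    """
    return [
        {
            'title': article.find_element_by_xpath('./h2').text,
            'share count': article.find_element_by_xpath('.//footer/div/a/span').text,
            'points': article.find_element_by_xpath('.//footer/div[2]/div[1]/div[3]').text,
        }
        for article in article_elements
    ]
```

With the driver from the question, this would be called as `meta_info_list = collect_meta(driver.find_elements_by_tag_name('article'))`. If any of the three lookups finds nothing, Selenium raises an exception and the whole comprehension stops, so the explicit loop is easier to extend if some articles lack a share count or points.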