Selenium / Python: How do I get the full text after expanding it with Selenium?

Time: 2017-11-17 15:56:58

Tags: python selenium-webdriver

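I am trying to scrape reviews from TripAdvisor. For long reviews only part of the text is shown, and you have to click "More" to reveal the full review. After clicking "More" I try to get the text (I can see that the text has expanded), but all I get is the partial review.

My code (stripped down to one specific review) is as follows: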

driver = webdriver.Firefox()
driver.get(url)
review = driver.find_element_by_id("review_541350982") 
review.find_element_by_class_name("taLnk.ulBlueLinks").click()
driver.wait = WebDriverWait(driver, 5)
new_review = driver.find_element_by_id("review_541350982")
entry = new_review.find_element_by_class_name("partial_entry")
print entry.text

Before clicking "More", this is the HTML:

<p class="partial_entry">This place blah blah blah What an...
<span class="taLnk ulBlueLinks" onclick="widgetEvCall('handlers.clickExpand',event,this);">More</span>
</p>

And this is the HTML afterwards:

<p class="partial_entry">This place blah blah blah What an incredible monument from both a historic and construction point of view.</p>
<span class="taLnk ulBlueLinks" onclick="widgetEvCall('handlers.clickCollapse',event,this);">Show less</span>

I noticed that after clicking "More", the <span> comes after the closing </p> instead of sitting inside it. Not sure whether that is useful.

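Any suggestions are much appreciated!

Edit: Note that introducing time.sleep(1) instead of driver.wait solved the problem. Is there a better way to do this, so that the new entry is picked up automatically once it changes, without having to set an arbitrary wait time?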

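2 Answers:

Answer 0: (score: 1)

It is clear from your code that WebDriverWait is defined but never actually used. Constructing a WebDriverWait object does not pause anything; the wait only happens when .until() is called on it. To print the full text This place blah blah blah What an incredible monument from both a historic and construction point of view. , you can use the following code block: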

from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

# Click "More" on the target review
review = driver.find_element(By.ID, "review_541350982")
review.find_element(By.CSS_SELECTOR, ".taLnk.ulBlueLinks").click()

# text_to_be_present_in_element expects a (by, value) locator tuple, not a WebElement
entry_locator = (By.XPATH, "//*[@id='review_541350982']//p[@class='partial_entry']")
expected = 'This place blah blah blah What an incredible monument from both a historic and construction point of view.'
WebDriverWait(driver, 10).until(EC.text_to_be_present_in_element(entry_locator, expected))
entry = driver.find_element(*entry_locator)
print(entry.text)
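
The wait above needs the full review text up front, which is usually unknown when scraping. A minimal sketch of an alternative, assuming the page swaps the truncated text for the full text in the same element after the click: read the partial text before clicking, then wait until the element's text differs from it. The helper name text_changed_from is hypothetical and not part of Selenium; it relies only on driver.find_element(by, value) and the element's .text attribute.

```python
def text_changed_from(locator, old_text):
    """Build a wait condition that is truthy once the located element's
    text is non-empty and differs from old_text.

    locator is a (by, value) tuple such as ("id", "review_541350982").
    Hypothetical helper, not part of Selenium's expected_conditions.
    """
    def condition(driver):
        try:
            current = driver.find_element(*locator).text
        except Exception:
            # Assumption: any lookup failure (element missing or stale while
            # the page re-renders) means "not ready yet", so the wait polls again.
            # With Selenium imported, this could be narrowed to its own exceptions.
            return False
        # Return the new text itself so the caller receives it from until().
        return current if current and current != old_text else False
    return condition
```

With a real driver this could be passed to the wait as WebDriverWait(driver, 10).until(text_changed_from(entry_locator, partial_text)), where partial_text is the entry's .text read before clicking "More". Because until() returns the condition's truthy result, the call would hand back the expanded text directly.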

Answer 1: (score: 0)

Locate the review and click "More":

from selenium.webdriver.common.by import By

review = driver.find_element(By.ID, "review_541350982")
partial_text = review.find_element(By.TAG_NAME, 'p')
partial_text.find_element(By.TAG_NAME, 'span').click()

Re-locate the review with XPath and print its text:

new_review = driver.find_element(By.XPATH, '(//*[@id="review_541350982"]//p)[1]')
print(new_review.text)

HTH