How can I scrape text from a located element using Selenium and Python?

Date: 2019-05-05 18:02:53

Tags: python selenium xpath css-selectors webdriverwait

I am trying to run the following code:

from selenium import webdriver
from selenium.webdriver.chrome.options import Options
import time
options = Options()
options.add_argument("start-maximized")
options.add_argument("disable-infobars")
options.add_argument("--disable-extensions")
driver = webdriver.Chrome(options=options)
driver.get('https://theunderminejournal.com/#eu/draenor/battlepet/1155')
time.sleep(20) #bypass cloudflare
price = driver.find_element_by_xpath('//*[@id="battlepet-page"]/div[1]/table/tr[3]/td/span')
print (price) 

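so that I can scrape the "Current Price" from the page. However, this XPath location does not return the text value (I also tried the "text" variable, but without success).

Thanks in advance for your replies.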

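3 Answers:

Answer 0: (Score: 2)

First, use WebDriverWait to wait for the element instead of sleeping for a fixed time.

Second, your locator does not find the element.

Try this: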

driver.get('https://theunderminejournal.com/#eu/draenor/battlepet/1155')
price = WebDriverWait(driver,30).until(EC.visibility_of_element_located((By.XPATH,"//div[@id='battlepet-page']/div/table/tr[@class='current-price']/td/span")))

print(price.text)

To use the wait, import the following:

from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.by import By
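
The difference between the two locators can be illustrated without a browser. The following is a minimal sketch using Python's standard-library xml.etree.ElementTree on a hypothetical, simplified copy of the table. The SAMPLE markup, its row contents, and the price value are assumptions made for illustration and are not taken from the real page. ElementTree supports only a subset of XPath, so the paths are shortened, but the sketch shows why a positional step such as tr[3] is fragile while a class-based predicate keeps working.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified markup; the real page's structure may differ.
SAMPLE = """
<div id="battlepet-page">
  <div>
    <table>
      <tr class="available"><th>Available</th><td><span>12</span></td></tr>
      <tr class="current-price"><th>Current Price</th><td><span>1,234g</span></td></tr>
    </table>
  </div>
</div>
"""

root = ET.fromstring(SAMPLE)  # root is the <div id="battlepet-page"> element

# Positional locator, in the style of tr[3] from the question.
# It depends on how many rows precede the one we want.
by_position = root.find("./div/table/tr[3]/td/span")

# Attribute-based locator, in the style of tr[@class='current-price'].
# It does not depend on the row order.
by_class = root.find("./div/table/tr[@class='current-price']/td/span")

print(by_position)    # None: this sample has only two rows
print(by_class)       # an Element object, not the price itself
print(by_class.text)  # the text has to be read from the element
```

Printing the element itself shows only an object representation, which is analogous to what print(price) produces in the question; the visible text has to be read from the element's text attribute.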

Answer 1: (Score: 1)

Before getting the text, you should wait for the element to become visible. See how WebDriverWait is used in the following example:

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as ec
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.chrome.options import Options

options = Options()
options.add_argument("start-maximized")
options.add_argument("disable-infobars")
options.add_argument("--disable-extensions")
driver = webdriver.Chrome(options=options)

wait = WebDriverWait(driver, 20)

driver.get('https://theunderminejournal.com/#eu/draenor/battlepet/1155')
current_price = wait.until(ec.visibility_of_element_located((By.CSS_SELECTOR, ".current-price .price"))).text

print(current_price)
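
To make it clearer why an explicit wait is preferable to a fixed time.sleep(20), here is a minimal standard-library sketch of the polling idea behind such a wait. It is a simplified stand-in and not Selenium's actual implementation. The names wait_until and price_text, as well as the simulated delay and price value, are hypothetical.

```python
import time


def wait_until(condition, timeout=5.0, poll_interval=0.1):
    """Call condition() repeatedly until it returns a truthy value.

    Returns that value, or raises TimeoutError once `timeout` seconds
    have passed without a truthy result.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} seconds")
        time.sleep(poll_interval)


# Hypothetical stand-in for a page whose price shows up after a short delay.
ready_at = time.monotonic() + 0.3


def price_text():
    return "1,234g" if time.monotonic() >= ready_at else None


# Returns as soon as the value is available instead of always waiting 2 seconds.
value = wait_until(price_text, timeout=2.0)
print(value)
```

The wait returns as soon as the condition holds, so a fast page is not slowed down, and a slow page fails with a clear timeout error instead of silently reading an element that is not there yet.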

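Answer 2: (Score: 0)

To scrape the value of the current price from the webpage, you need to induce WebDriverWait for visibility_of_element_located(), and you can use either of the following Locator Strategies:

  • Using CSS_SELECTOR: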

    print(WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.CSS_SELECTOR, "tr.current-price td>span"))).text)
    
  • Using XPATH:

    print(WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.XPATH, "//th[text()='Current Price']//following::td[1]/span"))).text)
    
  • Note: You have to add the following imports:

    from selenium.webdriver.support.ui import WebDriverWait
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
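
The XPath strategy above locates the value through its visible label ("Current Price") rather than through a class name. A rough offline sketch of that idea follows, using the standard-library xml.etree.ElementTree. ElementTree does not support the following:: axis, so a child-text predicate is swapped in here. The SAMPLE markup, its values, and the helper name value_for_label are assumptions for illustration only.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified markup; the real page may be structured differently.
SAMPLE = """
<table>
  <tr class="available"><th>Available</th><td><span>12</span></td></tr>
  <tr class="current-price"><th>Current Price</th><td><span>1,234g</span></td></tr>
</table>
"""

table = ET.fromstring(SAMPLE)


def value_for_label(table, label):
    """Return the span text in the row whose <th> text equals `label`, or None.

    `label` must not contain a single quote, because it is placed
    inside a quoted XPath predicate.
    """
    span = table.find(f"./tr[th='{label}']/td/span")
    return span.text if span is not None else None


print(value_for_label(table, "Current Price"))
```

Locating by label text keeps working if class names change, but it breaks if the label wording changes or the page is shown in another language, so which strategy is more robust depends on the page.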