Scraping data from a website using Selenium

Time: 2019-10-22 03:35:38

Tags: python-3.x selenium web-scraping

I'm still a beginner with Python, and I'm trying to scrape data from a website using Selenium.

 <small class="fxs_price_ohl"> <span>Open 1.29814</span> <span>High 1.29828</span> <span>Low 1.29775</span> </small> </div> </div> </li> <script type="application/ld+json">

I'm trying to get the values Open 1.29814, High 1.29828 and Low 1.29775 from the HTML above.

count_element = browser.find_element_by_xpath("//small[@class='fxs_price_ohl']//span")
print(count_element.text)

I'm using Selenium with Python, and the code above is what I have. However, count_element.text prints an empty string. How can I get Open 1.29814, High 1.29828 and Low 1.29775?

2 Answers:

Answer 0: (score: 0)

Use "find_elements_by_xpath" (note the plural "elements") if you want to retrieve multiple elements.

count_elements = browser.find_elements_by_xpath("//small[@class='fxs_price_ohl']//span")
for ele in count_elements:
    print(ele.text)
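
Each string printed by the loop above has the form "Open 1.29814". A minimal sketch of turning those strings into numbers follows, assuming every span holds exactly one label followed by one price. `parse_ohl` is a hypothetical helper name used for illustration; it is not part of Selenium.

```python
def parse_ohl(texts):
    """Turn strings like 'Open 1.29814' into a dict such as {'Open': 1.29814}."""
    prices = {}
    for text in texts:
        parts = text.split()
        if len(parts) != 2:
            # Skip empty strings or anything that is not "<label> <price>".
            continue
        label, value = parts
        prices[label] = float(value)
    return prices


# Sample strings copied from the HTML in the question.
sample = ["Open 1.29814", "High 1.29828", "Low 1.29775"]
print(parse_ohl(sample))
```

With Selenium, the input could be built as `[ele.text for ele in count_elements]`. Empty strings (for example, spans that the page has not filled in yet) are skipped rather than raising an error.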

Answer 1: (score: 0)

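You can also use a CSS selector: a class selector for the parent, a descendant combinator, and a type selector for the child spans. However, because the page loads slowly, you also need a wait condition.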

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

browser = webdriver.Chrome()
browser.get('https://www.fxstreet.com/rates-charts/gbpusd')
before_text = ''

while True:  # this could be improved with a timeout
    # Wait up to 20 seconds for the spans to be present in the DOM.
    elements = WebDriverWait(browser, 20).until(EC.presence_of_all_elements_located((By.CSS_SELECTOR, ".fxs_chart_cag_cont .fxs_price_ohl span")))
    elem = elements[-1]
    # The spans can exist before their text is filled in, so stop only once the last one has text.
    if elem.text != before_text:
        break
print([elem.text for elem in elements])
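
The comment in the loop above notes that a timeout would be an improvement. One possible sketch is a small polling helper that gives up after a fixed time. `poll_until_filled`, its parameters, and the default values are assumptions made for illustration, not part of Selenium. The helper takes any callable that returns a list of strings, so it can be tried without a browser.

```python
import time


def poll_until_filled(get_texts, timeout=30.0, interval=0.5):
    """Call get_texts() repeatedly until the last string it returns is non-empty.

    Raises TimeoutError if that has not happened within `timeout` seconds.
    """
    deadline = time.monotonic() + timeout
    while True:
        texts = get_texts()
        if texts and texts[-1] != "":
            return texts
        if time.monotonic() >= deadline:
            raise TimeoutError("span text was still empty after %.1f s" % timeout)
        # Pause between attempts instead of spinning in a tight loop.
        time.sleep(interval)


# Possible use with Selenium (assumes `browser` and `By` from the answer above):
# texts = poll_until_filled(
#     lambda: [e.text for e in browser.find_elements(
#         By.CSS_SELECTOR, ".fxs_chart_cag_cont .fxs_price_ohl span")]
# )
# print(texts)
```

The helper re-reads the spans on every attempt, as the original loop does, but sleeps between attempts and raises an error instead of looping forever if the text never appears.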