Extracting div tags that have no class or id using Selenium and Python

Time: 2018-06-18 16:04:46

Tags: python selenium-webdriver

I am trying to use Selenium to get hourly and daily water level data from a website.

My code so far:

import time 
from selenium import webdriver
from selenium.webdriver.support.ui import Select

driver = webdriver.Chrome(r"C:\Python27\chromedriver.exe")
driver.get('http://hydrology.gov.np/#/basin/77?_k=1r1onx')
time.sleep(5)
driver.find_elements_by_xpath("//*[contains(text(), 'Hourly')]").click()

But it does not work. I am new to Selenium. I have also tried other ways of locating the element, without success. Any help would be appreciated.

Code from the HTML:

<div style="padding: 16px 0px; display: table-cell; user-select: none; width: 96px;">
    <div>
        <span tabindex="0" style="border: 10px; box-sizing: border-box; display: block; font-family: Roboto, sans-serif; -webkit-tap-highlight-color: rgba(0, 0, 0, 0); cursor: pointer; text-decoration: none; margin: 0px; padding: 0px; outline: none; font-size: 15px; font-weight: inherit; position: relative; color: rgba(0, 0, 0, 0.87); line-height: 32px; transition: all 450ms cubic-bezier(0.23, 1, 0.32, 1) 0ms; min-height: 32px; white-space: nowrap; background: none;">
            <div>
                <div style="margin-left: 0px; padding: 0px 24px; position: relative;">
                    <div>Point</div>
                </div>
            </div>
        </span>
    </div>
    <div>
        <span tabindex="0" style="border: 10px; box-sizing: border-box; display: block; font-family: Roboto, sans-serif; -webkit-tap-highlight-color: rgba(0, 0, 0, 0); cursor: pointer; text-decoration: none; margin: 0px; padding: 0px; outline: none; font-size: 15px; font-weight: inherit; position: relative; color: rgb(255, 64, 129); line-height: 32px; transition: all 450ms cubic-bezier(0.23, 1, 0.32, 1) 0ms; min-height: 32px; white-space: nowrap; background: none;">
            <div>
                <div style="margin-left: 0px; padding: 0px 24px; position: relative;">
                    <div>Hourly</div>
                </div>
            </div>
        </span>
    </div>
    <div>
        <span tabindex="0" style="border: 10px; box-sizing: border-box; display: block; font-family: Roboto, sans-serif; -webkit-tap-highlight-color: rgba(0, 0, 0, 0); cursor: pointer; text-decoration: none; margin: 0px; padding: 0px; outline: none; font-size: 15px; font-weight: inherit; position: relative; color: rgba(0, 0, 0, 0.87); line-height: 32px; transition: all 450ms cubic-bezier(0.23, 1, 0.32, 1) 0ms; min-height: 32px; white-space: nowrap; background: none;">
            <div>
                <div style="margin-left: 0px; padding: 0px 24px; position: relative;">
                    <div>Daily</div>
                </div>
            </div>
        </span>
    </div>
</div>

2 answers:

Answer 0 (score: 1)

This line

driver.find_elements_by_xpath("//*[contains(text(), 'Hourly')]").click()

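uses find_elements, which returns a list of elements. A list has no click() method, so you have to iterate over the list and click the elements individually.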
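A minimal sketch of that iteration, assuming the generic `find_elements(by, value)` call (the `find_elements_by_*` helpers were removed in Selenium 4.3, so this form works on both old and new releases). The helper name `click_first_visible` is hypothetical, and the literal `"xpath"` stands in for `By.XPATH`, which has that same string value, so the function only needs a driver object passed in.

```python
XPATH = "xpath"  # same string value as selenium.webdriver.common.by.By.XPATH


def click_first_visible(driver, text):
    """Click the first displayed element whose text node contains `text`.

    Returns the number of matches found, so the caller can spot an
    ambiguous locator. Raises LookupError if nothing visible matches.
    """
    # Assumes `text` contains no single quote, because it is embedded
    # directly in the XPath string literal.
    locator = "//*[contains(text(), '{}')]".format(text)
    matches = driver.find_elements(XPATH, locator)  # a list, possibly empty
    for element in matches:
        if element.is_displayed():
            element.click()
            return len(matches)
    raise LookupError("no visible element contains text {!r}".format(text))
```

With a real driver this would be called as `click_first_visible(driver, 'Hourly')` once the page has finished loading.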

To get the hourly data, you need to replace this line with

table_body = driver.find_element_by_css_selector('table.table.tab-bar > tbody')
print(table_body.text)

This prints data such as:
Jun 18, 2018 6:00 AM 1.94 -0.15 0.89
Mon, Jun 18, 2018 7:00 AM 1.93 1.92 1.93
Mon, Jun 18, 2018 8:00 AM 1.91 1.91 1.91
Mon, Jun 18, 2018 9:00 AM 1.9 1.89 1.89
Mon, Jun 18, 2018 10:00 AM 1.88 1.88 1.88
Mon, Jun 18, 2018 11:00 AM 1.87 1.87 1.87
Mon, Jun 18, 2018 12:00 PM 1.88 1.88 1.88
Mon, Jun 18, 2018 1:00 PM 1.88 1.87 1.88
Mon, Jun 18, 2018 2:00 PM 1.84 1.84 1.84
Mon, Jun 18, 2018 3:00 PM 1.86 1.84 1.85
Mon, Jun 18, 2018 4:00 PM 1.77 1.77 1.77
Mon, Jun 18, 2018 5:00 PM 1.68 1.68 1.68
Mon, Jun 18, 2018 6:00 PM 1.63 1.62 1.62
Mon, Jun 18, 2018 7:00 PM 1.62 1.62 1.62
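If the printed text needs to become structured data, one possible approach is to parse each row with a regular expression. This is only a sketch: `parse_rows` is a hypothetical helper, the weekday prefix is treated as optional because the first sample row lacks it, and the three numeric columns are kept as an unnamed tuple because the answer does not say what they represent.

```python
import re
from datetime import datetime

# One row looks like: "Mon, Jun 18, 2018 7:00 AM 1.93 1.92 1.93"
ROW_RE = re.compile(
    r"^(?:[A-Za-z]{3},\s+)?"                               # optional weekday
    r"([A-Za-z]{3} \d{1,2}, \d{4} \d{1,2}:\d{2} [AP]M)"    # timestamp
    r"\s+(-?\d+(?:\.\d+)?)"                                # first value
    r"\s+(-?\d+(?:\.\d+)?)"                                # second value
    r"\s+(-?\d+(?:\.\d+)?)\s*$"                            # third value
)


def parse_rows(table_text):
    """Turn the table's text into a list of (datetime, (v1, v2, v3)) tuples.

    Lines that do not look like a data row are skipped.
    """
    rows = []
    for line in table_text.splitlines():
        match = ROW_RE.match(line.strip())
        if match is None:
            continue
        stamp = datetime.strptime(match.group(1), "%b %d, %Y %I:%M %p")
        values = tuple(float(match.group(i)) for i in (2, 3, 4))
        rows.append((stamp, values))
    return rows
```

It would be called as `parse_rows(table_body.text)`. The month and AM/PM parsing relies on an English locale.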

Answer 1 (score: 1)

Based on the HTML, your XPath //div[contains(text(),'Hourly')] looks correct. Step through the code in a debugger and you will find where the problem is.

You can click an element in three ways:

  1. element.click() (click() is a method of the located element, not of the driver)
  2. ActionChains click

    from selenium.webdriver.common.action_chains import ActionChains
    targetElement = driver.find_element_by_xpath("//div[contains(text(),'Hourly')]")
    actions = ActionChains(driver)
    actions.move_to_element(targetElement).click().perform()

  3. JavaScript executor click

    driver.execute_script("arguments[0].click();", targetElement)
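The first and third approaches can be combined so that the JavaScript click is used only when the native click fails. This is a rough sketch, not a recommended default: `click_with_js_fallback` and its `click_errors` parameter are hypothetical names, and with real Selenium you would probably pass `selenium.common.exceptions.WebDriverException` instead of the broad default.

```python
def click_with_js_fallback(driver, element, click_errors=(Exception,)):
    """Try a native click; on failure, click through the JavaScript executor.

    Returns "native" or "javascript" to report which path was taken.
    `click_errors` is the tuple of exception types that trigger the fallback.
    """
    try:
        element.click()
        return "native"
    except click_errors:
        # A JavaScript click bypasses the visibility and overlay checks
        # that make a native click fail, so use it deliberately.
        driver.execute_script("arguments[0].click();", element)
        return "javascript"
```

A JavaScript click does not simulate a real user interaction, so it can succeed on elements a user could not actually click. That is why it is kept as the last resort here.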