Scraping a dynamic page in Python

Date: 2020-04-02 16:36:58

Tags: python selenium web-scraping beautifulsoup scrapy

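I am trying to scrape this website in Python. When a company code is entered (for example 6177), the URL does not change, but the page and the values on it do.

I only need to scrape a single cell. A screenshot showing the exact cell is attached. The cell's address is: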

XPath: //*[@id="company"]/table[3]/tbody/tr[4]/td[1]
CSS selector: #company > table:nth-child(17) > tbody > tr:nth-child(4) > td:nth-child(1)

How should I go about this?

Thanks!

[Screenshot: the target table cell]

1 answer:

Answer 0: (score: 1)

To get the text 190,843 from the table, use WebDriverWait() together with visibility_of_element_located() and the following XPath:

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait

driver = webdriver.Chrome()
driver.get("https://mops.twse.com.tw/mops/web/index")
# Wait up to 10 s for the search box, then type the company code and press Enter.
WebDriverWait(driver, 10).until(EC.element_to_be_clickable((By.ID, "keyword"))).send_keys("6177", Keys.ENTER)
# Row 4, column 1 of the first table that follows the "營收資訊" heading.
xpath = "//div[text()='營收資訊']/following::table[1]//tr[4]/td[1]"
print(WebDriverWait(driver, 10).until(EC.visibility_of_element_located((By.XPATH, xpath))).text)
driver.quit()  # close the browser when done
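
The explicit wait is what copes with the page changing while the URL stays the same: WebDriverWait keeps re-checking the condition (every 0.5 s by default) until it returns something truthy or the timeout expires. The idea can be pictured with a small standard-library sketch. Note that wait_until and cell_text are hypothetical names made up for illustration, not part of Selenium, and the "page" here is only simulated.

```python
import time


def wait_until(condition, timeout=10.0, poll_interval=0.5):
    """Call condition() repeatedly until it returns a truthy value.

    Hypothetical helper that mimics the polling idea behind an explicit
    wait; it is not Selenium's implementation.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} s")
        time.sleep(poll_interval)


# Simulated page: the cell text only becomes available on the third check.
attempts = {"count": 0}


def cell_text():
    attempts["count"] += 1
    return "190,843" if attempts["count"] >= 3 else ""


value = wait_until(cell_text, timeout=2.0, poll_interval=0.01)
print(value)
```

In the Selenium script, the condition is the expected_conditions object and the returned value is the located element, so .text can be read from it directly.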

Output of the Selenium script:

190,843
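
The scraped value is a string. If it needs to be used as a number, a minimal sketch could look like the following, assuming the comma is a thousands separator (the page's unit is not stated here, and parse_grouped_int is a hypothetical helper name).

```python
def parse_grouped_int(text):
    """Convert a string such as '190,843' to an integer.

    Assumes commas are thousands separators and the value has no decimals.
    """
    return int(text.strip().replace(",", ""))


value = parse_grouped_int("190,843")
print(value)  # 190843
```

With the script above, the same helper could be applied to the element's .text before closing the browser.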