I am trying to get the stock codes from this page.
Here is my code:
from selenium import webdriver
import pandas as pd
url = 'https://stock360.hkej.com/StockScreener/profession/tab/profile'
browser = webdriver.Chrome('C:/chromedriver_win32/chromedriver.exe')
browser.get(url)
dfs = pd.read_html(browser.page_source)
print(dfs)
browser.close()
This is the output:
dfs
[ 0
0 加入至心水組合:請先登入或註冊成為會員, Empty DataFrame
Columns: [沒有符合以上篩選條件的股票。]
Index: [], 0
0 加入至心水組合:請先登入或註冊成為會員]
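I know the page is rendered with JavaScript, and I am already using Selenium. Why am I not getting the table? And how can I get the stock codes as they are displayed on the page? Thanks.
Additional information: after opening the link, select the second option from the green dropdown list, and the table will then be displayed.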
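Answer 0 (score: 1)
The table is only rendered after an option is selected from the dropdown and the form is submitted, so the page source captured right after loading contains no results. One approach is as follows: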
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
import pandas as pd
url = 'https://stock360.hkej.com/StockScreener/profession/tab/profile'
driver = webdriver.Chrome()
driver.get(url)
WebDriverWait(driver,10).until(EC.presence_of_all_elements_located((By.CSS_SELECTOR, 'option')))
# select the second dropdown option by its value attribute whose value is mb
driver.find_element(By.CSS_SELECTOR, '[value=mb]').click()  # find_element_by_* helpers were removed in Selenium 4
#wait for blue button to be clickable and click
WebDriverWait(driver, 5).until(EC.element_to_be_clickable((By.CSS_SELECTOR, '[href*=submit]'))).click()
#select table
table = driver.find_element(By.CSS_SELECTOR, '.dt960')  # find_element_by_* helpers were removed in Selenium 4
#transfer html of table to pandas read_html which handles tables
df = pd.read_html(table.get_attribute('outerHTML'))[0] #grab the table
df2 = df.drop(df.columns[0], axis=1).dropna(how='all') #lose the nan column and rows
df2.rename(columns=df.iloc[0], inplace = True) #set headers same as row 1
df2.drop(df.index[0], inplace = True) #lose row 1
df2 = df2.reset_index(drop=True) #re-index; reset_index returns a new frame, so assign it back
print(df2)
driver.quit()
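The pandas cleanup at the end can be tried without a browser. Below is a minimal sketch of that step on a hand-built frame; the helper name `promote_first_row_to_header` and the sample values are made up for illustration and are not taken from the site. It is a slight variant of the code above: it takes the header from the first row that remains after dropping empty rows, which assumes the header row is not entirely NaN.

```python
import pandas as pd


def promote_first_row_to_header(raw: pd.DataFrame) -> pd.DataFrame:
    """Drop the leading column and all-NaN rows, then use the first row as headers.

    Hypothetical helper for illustration only.
    """
    # Drop the first column and any rows that are entirely NaN.
    trimmed = raw.drop(raw.columns[0], axis=1).dropna(how="all")
    # The first remaining row is assumed to hold the header labels.
    header = trimmed.iloc[0]
    body = trimmed.iloc[1:].copy()
    body.columns = header.tolist()
    # reset_index returns a new frame, so the result has to be captured.
    return body.reset_index(drop=True)


# Made-up stand-in for the frame that pd.read_html would return.
raw = pd.DataFrame([
    [None, "Code", "Name"],
    [None, None, None],
    [None, "0001", "Example A"],
    [None, "0002", "Example B"],
])

clean = promote_first_row_to_header(raw)
print(clean)
```

Returning a new frame instead of using `inplace=True` avoids the easy mistake of calling `reset_index` and discarding its result.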