I'm trying to scrape data from this site, but my code doesn't retrieve any values.
import csv
import os
os.getcwd()
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
import time
options = Options()
options.headless = False
options.add_argument("--window-size=1920,1200")
driver = webdriver.Chrome(options=options, executable_path=r'insert path here')
completeurl = 'https://www.nasdaq.com/market-activity/stocks/MSFT/institutional-holdings'
driver.get(completeurl)
time.sleep(10)
increased_positions = driver.find_element_by_xpath('/html/body/div[2]/div/main/div[2]/div[4]/div[3]/div/div[1]/div/div[1]/div[2]/div/div[2]/div/table/tbody/tr[1]/td[3]')
print(increased_positions.text)
driver.quit()
This code throws an error.
Please help, thanks!
Answer 0 (score: 0)
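Here is a simple way to do this. pandas can parse every table on the page into a list of DataFrames, and you can pick the ones you need from that list: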
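from selenium import webdriver
from selenium.webdriver.chrome.options import Options
import pandas as pd
# Create the options object that is passed to the driver below
options = Options()
options.add_argument("--window-size=1920,1200")
# Note: executable_path is deprecated in Selenium 4, which expects a Service object instead
driver = webdriver.Chrome(options=options, executable_path=r'insert path here')
driver.get("https://www.nasdaq.com/market-activity/stocks/msft/institutional-holdings")
html = driver.page_source
# read_html returns one DataFrame per <table> element found in the HTML
tables = pd.read_html(html)
data = tables[1]
print(data)
driver.quit()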
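One thing that may be worth checking: if the tables on this page are filled in by JavaScript after the initial load (an assumption, I have not verified it for this site), `driver.page_source` read straight after `driver.get` might not contain them yet. Below is a minimal sketch of polling until the HTML contains a table, as an alternative to a fixed `time.sleep(10)`. `wait_for` is a hypothetical helper written for this illustration, not part of Selenium or pandas, and the `snapshots` iterator merely simulates a page that finishes loading late.

```python
import time


def wait_for(fetch, is_ready, timeout=10.0, interval=0.5):
    """Poll fetch() until is_ready(result) is true; raise TimeoutError otherwise.

    Hypothetical helper for illustration only (not a Selenium or pandas API).
    """
    deadline = time.monotonic() + timeout
    while True:
        result = fetch()
        if is_ready(result):
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} seconds")
        time.sleep(interval)


# Stand-in for driver.page_source: the table only appears on the third read.
snapshots = iter(["<html></html>", "<html></html>", "<html><table></table></html>"])
html = wait_for(lambda: next(snapshots), lambda page: "<table" in page, interval=0.01)
print("<table" in html)  # True
```

With a real driver this would look like `html = wait_for(lambda: driver.page_source, lambda page: "<table" in page)`, followed by `pd.read_html(html)`. The check only confirms that some table is present, not that `tables[1]` is already populated, so a stricter condition may be needed. Selenium also ships `WebDriverWait` for this kind of explicit wait, which is probably the better choice in real code.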