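I am testing the following code to scrape some options data from Marketchameleon.com. The original table is sorted by ATM IV % change, but I want to sort it by the "Implied Straddle Premium" column instead. Since the sort is not triggered by an ordinary button click, I tried (after inspecting the HTML source) to do it like this: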
from selenium import webdriver
from selenium.webdriver.common.by import By
from bs4 import BeautifulSoup as BSoup
browser = webdriver.PhantomJS()
browser.get("https://marketchameleon.com/Screeners/Options")
bs_obj = BSoup(browser.page_source, 'html.parser').encode("utf-8")
with open("Market_Chameleon_Unsorted.html", "w") as file:
    file.write(str(bs_obj))
element = browser.find_element_by_xpath("//th[@aria-label='Implied StraddlePremium %: activate to sort column ascending']")
browser.execute_script("arguments[0].setAttribute('aria-label','Implied StraddlePremium %: activate to sort column descending')", element)
bs_obj = BSoup(browser.page_source, 'html.parser').encode("utf-8")
with open("Market_Chameleon_Sorted.html", "w") as file:
    file.write(str(bs_obj))
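The code runs without any errors, but it does not sort the table: the unsorted and the sorted tables are identical (I parse the HTML files in R). It seems that after the JavaScript modifies the HTML, the page is not actually re-rendered. If I do a normal refresh, I just get the original HTML of the unsorted table again. How do I tell Selenium to sort the table? Is there another way to do this?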
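One thing that may be worth trying: changing `aria-label` only edits a text attribute and does not run the page's sort handler, so the rows stay where they are. A minimal sketch of the alternative is to click the header itself and check the result. It assumes the table behaves like a DataTables-style grid, where a click on the `th` toggles the sort order and the current order is reported in the header's `aria-sort` attribute; I have not confirmed that for this site. The helper name `click_until_sorted` and its parameters are made up for illustration, and `"xpath"` is the string value behind `By.XPATH`.

```python
import time


def click_until_sorted(browser, header_xpath, direction="descending",
                       max_clicks=3, settle_seconds=1.0):
    """Click a sortable header until its aria-sort attribute equals direction.

    Assumption: header clicks toggle the sort order and the table exposes the
    current order through aria-sort ("ascending" or "descending").
    Returns True if the requested order was reached, False otherwise.
    """
    for _ in range(max_clicks):
        # Re-locate the header on every pass in case the page redrew it.
        header = browser.find_element("xpath", header_xpath)
        if header.get_attribute("aria-sort") == direction:
            return True
        header.click()              # a real click runs the page's own handler
        time.sleep(settle_seconds)  # crude fixed pause to let the table redraw
    header = browser.find_element("xpath", header_xpath)
    return header.get_attribute("aria-sort") == direction
```

With the `browser` from the question, this could be called as `click_until_sorted(browser, "//th[starts-with(@aria-label, 'Implied StraddlePremium %')]")`, and `browser.page_source` read again only if it returns True. Matching on the start of the label avoids depending on the "ascending"/"descending" suffix, which changes after each click.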