I am trying to retrieve the Fear & Greed index from http://money.cnn.com/data/fear-and-greed/. The index changes dynamically. When I inspect the element, I can see the values in the page markup. How can I use Python Selenium to get the 84 and the other index values? I tried the code below, but the result is just blank. Any ideas?
cr = WebDriverWait(driver, 10).until(EC.presence_of_element_located((By.XPATH, "//*[contains(text(), 'Fear & Greed Now')]")))
Answer 0 (score: 2)
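According to the specification, .text gives you only the rendered text by default, and I suspect it comes back empty because of the "weird styling" of the needleChart parent container.
You need to use innerHTML instead of .text to solve the "empty text" problem:
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

driver = webdriver.Firefox()
driver.get("http://money.cnn.com/data/fear-and-greed/")
driver.maximize_window()

# wait until the chart container is present in the DOM
wait = WebDriverWait(driver, 10)
list_indexes = wait.until(EC.presence_of_element_located((By.CSS_SELECTOR, "#needleChart")))

# find_elements_by_tag_name was removed in recent Selenium 4 releases;
# find_elements(By.TAG_NAME, ...) is the current form
indexes = list_indexes.find_elements(By.TAG_NAME, "li")
for index in indexes:
    print(index.get_attribute("innerHTML"))

driver.close()
Prints:
Fear & Greed Now: 86 (Extreme Greed)
Fear & Greed Previous Close: 86 (Extreme Greed)
Fear & Greed 1 Week Ago: 89 (Extreme Greed)
Fear & Greed 1 Month Ago: 57 (Greed)
Fear & Greed 1 Year Ago: 16 (Extreme Fear)
Then you can post-process these texts and build a result dictionary, with the periods as keys and the index values as values:
import re

# run this while the driver is still open, i.e. before driver.close()
pattern = re.compile(r"^Fear & Greed (.*?): (\d+)")
d = dict(pattern.search(index.get_attribute("innerHTML")).groups() for index in indexes)
print(d)
Prints:
{'Now': '86', 'Previous Close': '86', '1 Week Ago': '89', '1 Month Ago': '57', '1 Year Ago': '16'}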
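The post-processing step can be tried without a browser. The following is a minimal sketch, not the answerer's exact code: parse_indexes is a hypothetical helper, the sample strings are taken from the output above (with one entry altered to contain &amp;), and the html.unescape call is an assumption that covers the case where innerHTML returns "&amp;" rather than "&". It also converts the values to integers.

```python
import html
import re

# Matches e.g. "Fear & Greed 1 Week Ago: 89 (Extreme Greed)"
PATTERN = re.compile(r"^Fear & Greed (.*?): (\d+)")


def parse_indexes(lines):
    """Map each period label to its integer index value (hypothetical helper)."""
    result = {}
    for line in lines:
        # Assumption: innerHTML may carry HTML entities such as "&amp;".
        match = PATTERN.search(html.unescape(line).strip())
        if match:  # skip items that do not look like an index line
            period, value = match.groups()
            result[period] = int(value)
    return result


sample = [
    "Fear & Greed Now: 86 (Extreme Greed)",
    # altered on purpose to exercise the unescape step
    "Fear &amp; Greed Previous Close: 86 (Extreme Greed)",
    "Fear & Greed 1 Week Ago: 89 (Extreme Greed)",
    "Fear & Greed 1 Month Ago: 57 (Greed)",
    "Fear & Greed 1 Year Ago: 16 (Extreme Fear)",
]
print(parse_indexes(sample))
```

With a live page, the list passed in would be [index.get_attribute("innerHTML") for index in indexes], collected before the driver is closed.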
Answer 1 (score: 1)
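You can get it by locating the element and extracting its innerHTML:
# assumes driver is an existing WebDriver instance and By is imported from selenium.webdriver.common.by
element = driver.find_element(By.XPATH, "//div[@id='needleChart']/ul/li")
text = element.get_attribute("innerHTML")
The text will contain the full content of that list item.
You can then use a regular expression to extract the greed index from that string.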
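As an illustration of the regular-expression step, here is a hedged sketch that pulls the period, the score, and the rating out of a single string. It assumes the string has the form "Fear & Greed Now: 86 (Extreme Greed)"; the Reading type and parse_reading function are hypothetical names introduced for this example.

```python
import re
from typing import NamedTuple, Optional


class Reading(NamedTuple):
    period: str   # e.g. "Now" or "1 Week Ago"
    score: int    # the numeric index value
    rating: str   # the text in parentheses, e.g. "Extreme Greed"


# Accepts both "&" and the HTML entity "&amp;" (an assumption about innerHTML).
READING_RE = re.compile(r"Fear &(?:amp;)? Greed (.+?):\s*(\d+)\s*\(([^)]+)\)")


def parse_reading(text: str) -> Optional[Reading]:
    """Return the parsed reading, or None if the text has a different shape."""
    match = READING_RE.search(text)
    if match is None:
        return None
    period, score, rating = match.groups()
    return Reading(period, int(score), rating)


print(parse_reading("Fear & Greed Now: 86 (Extreme Greed)"))
```

Returning None for unexpected input lets the caller decide whether a missing value is an error, which can be useful because the page layout is outside your control.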
Answer 2 (score: 0)
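Try the following:
# assumes driver, By, WebDriverWait and EC are set up as in a typical Selenium script
chart = WebDriverWait(driver, 10).until(EC.presence_of_element_located((By.ID, "needleChart")))
elements = chart.find_elements(By.TAG_NAME, "li")
for li in elements:
    text = li.get_attribute("innerHTML")
    # keep only the digit characters of the item text
    s = ''.join(x for x in text if x.isdigit())
    print(s)
Hope it helps.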
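One caveat with joining every digit: labels such as "1 Week Ago" contain digits of their own, so they end up in the result. The sketch below compares that approach with a hypothetical alternative, value_after_colon, which takes only the number following the colon. The sample string is assumed to match the format the page produces.

```python
import re


def digits_only(text):
    """The approach from the snippet above: keep every digit in the string."""
    return "".join(ch for ch in text if ch.isdigit())


def value_after_colon(text):
    """Hypothetical alternative: take only the number that follows the colon."""
    match = re.search(r":\s*(\d+)", text)
    return match.group(1) if match else None


text = "Fear & Greed 1 Week Ago: 89 (Extreme Greed)"
print(digits_only(text))        # 189 (the "1" from "1 Week Ago" is included)
print(value_after_colon(text))  # 89
```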