This is my Python code:
import pandas as pd
import pandas_datareader.data as web
import bs4 as bs
import urllib.request as ul
from matplotlib import style  # needed for style.use() below
from selenium import webdriver

style.use('ggplot')
driver = webdriver.PhantomJS(executable_path='C:\\Phantomjs\\bin\\phantomjs.exe')

def getBondRate():
    #driver.deleteAllCookies();
    url = "https://www.marketwatch.com/investing/index/tnx?countrycode=xx"
    driver.get(url)
    driver.implicitly_wait(10)
    # page_source is the HTML of the page as currently rendered by the browser
    html = driver.page_source
    return html

bondRate = getBondRate()
print(bondRate)
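A few days ago this read the MarketWatch page without problems. Now it returns nothing inside the body tag. Is Selenium not loading the page?
Answer 0: (score: 0)
Do you still need the HTML tags? If not, you can try retrieving the text through the body tag instead. This is how I do it in Java.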
String src=driver.findElement(By.tagName("body")).getText();
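Since the question uses Python, a rough Python equivalent of the Java line above might look like the following sketch. The helper name `get_body_text` is hypothetical, and the fallback constant exists only so the snippet can be loaded where Selenium is not installed; it assumes `driver` is an already-created WebDriver instance such as the one in the question.

```python
try:
    from selenium.webdriver.common.by import By
    TAG_NAME = By.TAG_NAME
except ImportError:
    # Assumption: fall back to the locator string that By.TAG_NAME stands for.
    TAG_NAME = "tag name"


def get_body_text(driver):
    """Return the rendered text of the <body> element, without HTML tags."""
    # Same idea as driver.findElement(By.tagName("body")).getText() in Java.
    return driver.find_element(TAG_NAME, "body").text
```

With the driver from the question, `print(get_body_text(driver))` after `driver.get(url)` would print only the visible text. If the body is empty, as described in the question, this will still return an empty string.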
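Answer 1: (score: 0)
For the URL https://www.marketwatch.com/investing/index/tnx?countrycode=xx, the behavior you are observing is entirely expected.
I took your code, made one small adjustment, and tried extracting page_source with both PhantomJS and ChromeDriver. With either WebDriver variant, the WebDriver fingerprint is detected and a Fingerprinting error is raised, as shown below:
Error details: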
Failed to load resource: the server responded with a status of 404 (Not Found)
kpf.js?url=/149e9513-01fa-4fb0-aad4-566afd725d1b/2d206a39-8ed7-437e-a3be-862e0f06eea3/fingerprint&token=058cbc6a-f8b8-f175-ca68-8c2e0fd6a4e3:1 Fingerprinting error
name: Error
message: Error issuing AJAX request (status code: 404)
stack: Error: Error issuing AJAX request (status code: 404)
at XMLHttpRequest.N.a.onreadystatechange (https://www.marketwatch.com/149e9513-01fa-4fb0-aad4-566afd725d1b/2d206a39-8ed7-437e-a3be-862e0f06eea3/fingerprint/script/kpf.js?url=/149e9513-01fa-4fb0-aad4-566afd725d1b/2d206a39-8ed7-437e-a3be-862e0f06eea3/fingerprint&token=058cbc6a-f8b8-f175-ca68-8c2e0fd6a4e3:1:1884)
DevTools failed to parse SourceMap: https://www.marketwatch.com/149e9513-01fa-4fb0-aad4-566afd725d1b/2d206a39-8ed7-437e-a3be-862e0f06eea3/fingerprint/script/fingerprint.js.map
DevTools snapshot:
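
To check from Python whether a session is in the state this answer describes, one option is to look at two things: whether the browser exposes `navigator.webdriver` (a flag defined by the W3C WebDriver specification, and one of several signals a fingerprinting script may read), and whether the returned page has any visible text in its body. The sketch below is only a diagnostic under those assumptions. The names `visible_body_text` and `diagnose` are hypothetical, and nothing here is confirmed to be what the MarketWatch script actually checks.

```python
from html.parser import HTMLParser


class _BodyTextParser(HTMLParser):
    """Collect text found inside <body>, skipping <script> and <style>."""

    def __init__(self):
        super().__init__()
        self.in_body = False
        self.skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag == "body":
            self.in_body = True
        elif tag in ("script", "style"):
            self.skip_depth += 1

    def handle_endtag(self, tag):
        if tag == "body":
            self.in_body = False
        elif tag in ("script", "style") and self.skip_depth:
            self.skip_depth -= 1

    def handle_data(self, data):
        if self.in_body and not self.skip_depth:
            self.chunks.append(data)


def visible_body_text(html):
    """Return the body text of an HTML string with whitespace collapsed."""
    parser = _BodyTextParser()
    parser.feed(html)
    parser.close()
    return " ".join(" ".join(parser.chunks).split())


def diagnose(driver):
    """Report the automation flag and whether the loaded page body is empty."""
    flag = driver.execute_script("return navigator.webdriver")
    text = visible_body_text(driver.page_source)
    return {"navigator_webdriver": bool(flag), "body_is_empty": text == ""}
```

Calling `diagnose(driver)` right after `driver.get(url)` gives a quick indication of whether the page came back empty. A `True` value for `navigator_webdriver` only shows that the browser advertises itself as automated; it does not prove that this is the signal the site reacted to.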