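I want to access a file automatically using Python 3. The URL is https://www.dax-indices.com/documents/dax-indices/Documents/Resources/WeightingFiles/Ranking/2019/March/MDAX_RKC.20190329.xls
When you enter the URL manually in the browser, it prompts you to download the file, but I want to do this automatically in Python and load the data as a df.
I get the following error: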
URLError:
{{1}}
Answer 0 (score: 0)
$ curl https://www.dax-indices.com/documents/dax-indices/Documents/Resources/WeightingFiles/Ranking/2019/March/MDAX_RKC.20190329.xls
<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
<html><head>
<title>307 Temporary Redirect</title>
</head><body>
<h1>Temporary Redirect</h1>
<p>The document has moved <a href="https://www.dax-indices.com/document/Resources/WeightingFiles/Ranking/2019/March/MDAX_RKC.20190329.xls">here</a>.</p>
</body></html>
You are simply being redirected. There are several ways to handle this in code, but I would just change the URL to "https://www.dax-indices.com/document/Resources/WeightingFiles/Ranking/2019/March/MDAX_RKC.20190329.xls"
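One of the other ways to handle the redirect in code can be sketched with the standard library alone: `urllib.request` follows a 307 redirect for GET requests by itself and reports the final location. The helper name `fetch_following_redirects` and the `User-Agent` header value are assumptions for illustration and do not come from the original post.

```python
import urllib.request


def fetch_following_redirects(url, timeout=30):
    """Download `url`, letting urllib follow any HTTP redirects.

    Returns a tuple (final_url, body_bytes).
    """
    # Assumption: a browser-like User-Agent is sent in case the server
    # rejects the default Python one; drop the header if it is not needed.
    request = urllib.request.Request(url, headers={"User-Agent": "Mozilla/5.0"})
    with urllib.request.urlopen(request, timeout=timeout) as response:
        # geturl() is the address that was finally served, after redirects
        return response.geturl(), response.read()
```

Called with the original `/documents/dax-indices/Documents/...` URL, the first element of the result should be the `/document/Resources/...` address shown in the curl output, provided the server still answers the same way.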
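Answer 1 (score: 0)
I ran your code in a Jupyter environment and it worked. No error was raised, but the DataFrame contained only NaN values. I checked the xls file you are trying to read, and it does not seem to contain any data...
There are other ways to retrieve xls data, for example: downloading an excel file from the web in python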
import pandas as pd
import requests

url = 'https://www.dax-indices.com/documents/dax-indices/Documents/Resources/WeightingFiles/Ranking/2019/March/MDAX_RKC.20190329.xls'
resp = requests.get(url)
resp.raise_for_status()  # fail early on an HTTP error status

# Save the downloaded bytes to disk, then load the file with pandas
with open('my-sheet.xls', 'wb') as output:
    output.write(resp.content)

df = pd.read_excel('my-sheet.xls')
print(df.head())
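When the resulting DataFrame is empty or full of NaN, it may help to check what was actually downloaded before handing it to pandas. The sketch below inspects the first bytes: a legacy `.xls` workbook is an OLE2 compound file, an `.xlsx` workbook is a ZIP archive, and a redirect or error page starts with HTML. The function name `sniff_download` and its return labels are hypothetical, chosen only for this illustration.

```python
# File signatures ("magic bytes") at the start of each format
OLE2_MAGIC = bytes.fromhex("D0CF11E0A1B11AE1")  # legacy .xls container
ZIP_MAGIC = b"PK\x03\x04"                       # .xlsx is a ZIP archive


def sniff_download(content):
    """Guess what kind of payload the downloaded bytes are."""
    if content.startswith(OLE2_MAGIC):
        return "xls"
    if content.startswith(ZIP_MAGIC):
        return "xlsx"
    # A redirect or error page typically begins with an HTML preamble
    head = content[:512].lstrip().lower()
    if head.startswith(b"<!doctype html") or head.startswith(b"<html"):
        return "html"
    return "unknown"
```

For example, `sniff_download(resp.content)` returning `"html"` would suggest that the server sent a page such as the 307 notice rather than the workbook itself.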
Answer 2 (score: 0)
You can use pandas directly, with its read_excel method
df = pd.read_excel("https://www.dax-indices.com/documents/dax-indices/Documents/Resources/WeightingFiles/Ranking/2019/March/MDAX_RKC.20190329.xls", sheet_name='Data', skiprows=5)
df.head(1)
Answer 3 (score: 0)
Sorry, mate. It works on my PC (not a very helpful comment, I know). Here is a list of things you could try ->