以下代码从互联网金融门户网站(Morningstar)获取特定数据。我从不同的公司获得数据,在这种情况下来自荷兰公司。每个都由一个自动收报机代表。
import pandas as pd
import numpy as np
def financials_download(ticker,report,frequency):
if frequency == "A" or frequency == "a":
frequency = "12"
elif frequency == "Q" or frequency == "q":
frequency = "3"
url = 'http://financials.morningstar.com/ajax/ReportProcess4CSV.html?&t='+ticker+'®ion=usa&culture=en-US&cur=USD&reportType='+report+'&period='+frequency+'&dataType=R&order=desc&columnYear=5&rounding=3&view=raw&r=640081&denominatorView=raw&number=3'
df = pd.read_csv(url, skiprows=1, index_col=0)
return df
def ratios_download(ticker):
url = 'http://financials.morningstar.com/ajax/exportKR2CSV.html?&callback=?&t='+ticker+'®ion=usa&culture=en-US&cur=USD&order=desc'
df = pd.read_csv(url, skiprows=2, index_col=0)
return df
holland=("AALBF","ABN","AEGOF", "AHODF", "AKZO","ALLVF","AMSYF","ASML","KKWFF","KDSKF","GLPG","GTOFF","HINKF","INGVF","KPN","NN","LIGHT","RANJF","RDLSF","RDS.A","SBFFF", "UNBLF", "UNLVF", "VOPKF", "WOLTF")
def finance(country):
for ticker in country:
frequency = "a"
df1 = financials_download(ticker,'bs',frequency)
df2 = financials_download(ticker,'is',frequency)
df3 = ratios_download(ticker)
d1 = df1.loc['Total assets']
if np.any("EBITDA" in df2.index) == True:
d2 = df2.loc["EBITDA"]
else:
d2 = None
if np.any("Revenue USD Mil" in df3.index) == True:
d3 = df3.loc["Revenue USD Mil"]
else:
d3 = df3.loc["Revenue EUR Mil"]
d4 = df3.loc["Operating Margin %"]
d5 = df3.loc["Return on Assets %"]
d6 = df3.loc["Return on Equity %"]
d7 = df3.loc["EBT Margin"]
d8 = df3.loc["Net Margin %"]
d9 = df3.loc["Free Cash Flow/Sales %"]
if d2 is not None:
d1=d1.to_frame().T
d2=d2.to_frame().T
d3=d3.to_frame().T
d4=d4.to_frame().T
d5=d5.to_frame().T
d6=d6.to_frame().T
d7=d7.to_frame().T
d8=d8.to_frame().T
d9=d9.to_frame().T
df_new=pd.concat([d1,d2,d3,d4,d5,d6,d7,d8,d9])
else:
d1=d1.to_frame().T
d3=d3.to_frame().T
d4=d4.to_frame().T
d5=d5.to_frame().T
d6=d6.to_frame().T
d7=d7.to_frame().T
d8=d8.to_frame().T
d9=d9.to_frame().T
df_new=pd.concat([d1,d3,d4,d5,d6,d7,d8,d9])
df_new.to_csv(ticker+'.csv')
问题在于,当我使用for
循环以便它遍历变量holland
的所有代码并为每个代码生成csv
文档时,它会返回以下错误:
File "pandas/_libs/parsers.pyx", line 565, in
pandas._libs.parsers.TextReader.__cinit__ (pandas\_libs\parsers.c:6260)
EmptyDataError: No columns to parse from file
另一方面,如果我只选择一个公司的股票代码,那么它运行没有错误。
如果你能帮助我,我真的很感激。
答案 0 :(得分:0)
当您多次运行脚本时,它会在不同的代码和不同的调用上失败。这表明问题与特定的自动收报机无关,而是来自csv阅读器的调用不会返回可以读入数据框的值。您可以使用Python的错误处理例程来解决此问题,例如对于financials_download
函数:
df = ""
i = 0
#some data in df?
while len(df) == 0:
#try to download data and load them into df
try:
df = pd.read_csv(url, skiprows=1, index_col=0)
#not successful? Count failed attempts
except:
i += 1
print("Trial", i, "failed")
#five attempts failed? Unlikely that this server will respond
if i == 5:
print("ticker", ticker, ": server is down")
break
#print("downloaded", ticker)
#print("financial download data frame:")
#print(df)
这会尝试五次从自动收报机中检索数据,如果失败,则会输出一条不成功的消息。但是现在你必须在主程序中处理这种情况并进行调整,因为有些数据框是空的。 我想指出这种对a blog post的基本调试。