Question

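The following code retrieves specific data from an internet financial portal (Morningstar). I fetch the data for different companies, in this case Dutch companies. Each company is represented by a ticker.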

import pandas as pd
import numpy as np

def financials_download(ticker,report,frequency):
    if frequency == "A" or frequency == "a":
        frequency = "12"
    elif frequency == "Q" or frequency == "q":
        frequency = "3"
    url = 'http://financials.morningstar.com/ajax/ReportProcess4CSV.html?&t='+ticker+'&region=usa&culture=en-US&cur=USD&reportType='+report+'&period='+frequency+'&dataType=R&order=desc&columnYear=5&rounding=3&view=raw&r=640081&denominatorView=raw&number=3'
    df = pd.read_csv(url, skiprows=1, index_col=0)
    return df


def ratios_download(ticker):
    url = 'http://financials.morningstar.com/ajax/exportKR2CSV.html?&callback=?&t='+ticker+'&region=usa&culture=en-US&cur=USD&order=desc'
    df = pd.read_csv(url, skiprows=2, index_col=0)
    return df


holland=("AALBF","ABN","AEGOF", "AHODF", "AKZO","ALLVF","AMSYF","ASML","KKWFF","KDSKF","GLPG","GTOFF","HINKF","INGVF","KPN","NN","LIGHT","RANJF","RDLSF","RDS.A","SBFFF", "UNBLF", "UNLVF", "VOPKF", "WOLTF")

def finance(country):
    for ticker in country:
        frequency = "a"
        df1 = financials_download(ticker,'bs',frequency)
        df2 = financials_download(ticker,'is',frequency)
        df3 = ratios_download(ticker)


        d1 = df1.loc['Total assets']

        if "EBITDA" in df2.index:
            d2 = df2.loc["EBITDA"]
        else:
            d2 = None

        if "Revenue USD Mil" in df3.index:
            d3 = df3.loc["Revenue USD Mil"]
        else:
            d3 = df3.loc["Revenue EUR Mil"]

        d4 = df3.loc["Operating Margin %"]
        d5 = df3.loc["Return on Assets %"]
        d6 = df3.loc["Return on Equity %"]
        d7 = df3.loc["EBT Margin"]
        d8 = df3.loc["Net Margin %"]
        d9 = df3.loc["Free Cash Flow/Sales %"]

        # d2 is None when the income statement has no EBITDA row, so skip it
        rows = [d for d in (d1, d2, d3, d4, d5, d6, d7, d8, d9) if d is not None]
        # turn each row (a Series) into a one-row frame and stack them
        df_new = pd.concat([row.to_frame().T for row in rows])

        df_new.to_csv(ticker+'.csv')

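The problem is that when I use the for loop so that it iterates over all the tickers in the variable holland and generates a CSV file for each one, it returns the following error: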

File "pandas/_libs/parsers.pyx", line 565, in
pandas._libs.parsers.TextReader.__cinit__ (pandas\_libs\parsers.c:6260)

EmptyDataError: No columns to parse from file

On the other hand, if I pick just one company's ticker, it runs without errors.

I would really appreciate it if you could help me.

Answer 1

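When you run the script several times, it fails on different tickers and on different calls. This suggests that the problem is not tied to a specific ticker; rather, the call to the CSV reader sometimes does not return anything that can be read into a data frame. You can use Python's error handling to work around this, for example in the financials_download function: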

# start with an empty frame so the caller always gets a DataFrame back
df = pd.DataFrame()
i = 0
# keep trying while df is still empty
while df.empty:
    # try to download the data and load it into df
    try:
        df = pd.read_csv(url, skiprows=1, index_col=0)
    # not successful? Count the failed attempt
    except Exception:
        i += 1
        print("Trial", i, "failed")
        # five attempts failed? Unlikely that this server will respond
        if i == 5:
            print("ticker", ticker, ": server is down")
            break
#print("downloaded", ticker)
#print("financial download data frame:")
#print(df)

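This tries five times to retrieve the data for the ticker and, if that fails, prints a message saying so. But now you have to handle this case in the main program and adjust it accordingly, because some data frames will be empty. I would like to point to a blog post for this kind of basic debugging.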
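One way the main program could cope with empty data frames is sketched below. This is only an illustration under some assumptions: the helper names download_with_retry, pick_rows and build_report are hypothetical and not part of the original code, and the reader parameter exists only so the sketch can be tried without network access. The idea is to return an empty DataFrame after the last failed attempt, then keep only the rows that are actually present before concatenating.

```python
import pandas as pd


def download_with_retry(url, attempts=5, reader=pd.read_csv, **kwargs):
    """Return the parsed frame, or an empty DataFrame if every attempt fails.

    `reader` is a hypothetical hook: it defaults to pd.read_csv and can be
    swapped for a fake function when trying the sketch offline.
    """
    for attempt in range(1, attempts + 1):
        try:
            return reader(url, **kwargs)
        except Exception as exc:  # e.g. EmptyDataError or a network error
            print("Trial", attempt, "failed:", type(exc).__name__)
    return pd.DataFrame()


def pick_rows(df, labels):
    """Keep only the labels that are actually present in df.index."""
    present = [label for label in labels if label in df.index]
    return df.loc[present]


def build_report(frames_and_labels):
    """Stack the wanted rows of every non-empty frame; None if nothing is usable."""
    parts = [pick_rows(df, labels) for df, labels in frames_and_labels if not df.empty]
    parts = [part for part in parts if not part.empty]
    return pd.concat(parts) if parts else None
```

In the finance loop, the result of build_report could be checked with "if report is None: continue" so that a ticker whose downloads all failed is skipped instead of raising an error.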
