Question

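I have a class assignment to write a Python program that downloads end-of-day data from Yahoo Finance for the major global stock market indices over the past 25 years: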

Dow Jones Index (USA)
S&P 500 (USA)
NASDAQ (USA)
DAX (Germany)
FTSE (UK)
HANGSENG (Hong Kong)
KOSPI (Korea)
CNX NIFTY (India)

Unfortunately, when I run the program an error occurs.

  File "C:\ProgramData\Anaconda2\lib\site-packages\yahoofinancials\__init__.py", line 91, in format_date
    form_date = datetime.datetime.fromtimestamp(int(in_date)).strftime('%Y-%m-%d')

ValueError: timestamp out of range for platform localtime()/gmtime() function

The code I wrote is below. I am trying to debug the error. Can you help me? Thanks.

from yahoofinancials import YahooFinancials
import pandas as pd

# Select Tickers and stock history dates
index1 = '^DJI'
index2 = '^GSPC'
index3 = '^IXIC'
index4 = '^GDAXI'
index5 = '^FTSE'
index6 = '^HSI'
index7 = '^KS11'
index8 = '^NSEI'
freq = 'daily'
start_date = '1993-06-30'
end_date = '2018-06-30'


# Function to clean data extracts
def clean_stock_data(stock_data_list):
    new_list = []
    for rec in stock_data_list:
        if 'type' not in rec.keys():
            new_list.append(rec)
    return new_list

# Construct yahoo financials objects for data extraction
dji_financials = YahooFinancials(index1)
gspc_financials = YahooFinancials(index2)
ixic_financials = YahooFinancials(index3)
gdaxi_financials = YahooFinancials(index4)
ftse_financials = YahooFinancials(index5)
hsi_financials = YahooFinancials(index6)
ks11_financials = YahooFinancials(index7)
nsei_financials = YahooFinancials(index8)


# Clean returned stock history data and remove dividend events from price history
daily_dji_data = clean_stock_data(dji_financials
                                     .get_historical_stock_data(start_date, end_date, freq)[index1]['prices'])
daily_gspc_data = clean_stock_data(gspc_financials
                                     .get_historical_stock_data(start_date, end_date, freq)[index2]['prices'])
daily_ixic_data = clean_stock_data(ixic_financials
                                     .get_historical_stock_data(start_date, end_date, freq)[index3]['prices'])
daily_gdaxi_data = clean_stock_data(gdaxi_financials
                                     .get_historical_stock_data(start_date, end_date, freq)[index4]['prices'])
daily_ftse_data = clean_stock_data(ftse_financials
                                     .get_historical_stock_data(start_date, end_date, freq)[index5]['prices'])
daily_hsi_data = clean_stock_data(hsi_financials
                                     .get_historical_stock_data(start_date, end_date, freq)[index6]['prices'])
daily_ks11_data = clean_stock_data(ks11_financials
                                     .get_historical_stock_data(start_date, end_date, freq)[index7]['prices'])
daily_nsei_data = clean_stock_data(nsei_financials
                                     .get_historical_stock_data(start_date, end_date, freq)[index8]['prices'])
stock_hist_data_list = [{'^DJI': daily_dji_data}, {'^GSPC': daily_gspc_data}, {'^IXIC': daily_ixic_data},
                        {'^GDAXI': daily_gdaxi_data}, {'^FTSE': daily_ftse_data}, {'^HSI': daily_hsi_data},
                        {'^KS11': daily_ks11_data}, {'^NSEI': daily_nsei_data}]


# Function to construct a data frame of closing prices, one column per index
def build_data_frame(data_list1, data_list2, data_list3, data_list4, data_list5, data_list6, data_list7, data_list8):
    data_dict = {}
    i = 0
    for list_item in data_list2:
        if 'type' not in list_item.keys():
            data_dict.update({list_item['formatted_date']: {'^DJI': data_list1[i]['close'], '^GSPC': list_item['close'],
                                                            '^IXIC': data_list3[i]['close'], '^GDAXI': data_list4[i]['close'],
                                                            '^FTSE': data_list5[i]['close'], '^HSI': data_list6[i]['close'],
                                                            '^KS11': data_list7[i]['close'], '^NSEI': data_list8[i]['close']}})
            i += 1
    tseries = pd.to_datetime(list(data_dict.keys()))
    df = pd.DataFrame(data=list(data_dict.values()), index=tseries,
                      columns=['^DJI', '^GSPC', '^IXIC', '^GDAXI', '^FTSE', '^HSI', '^KS11', '^NSEI']).sort_index()
    return df

Answer 1

Your problem is that your datetime stamps are in the wrong format. If you look at the error, it tells you plainly:

datetime.datetime.fromtimestamp(int(in_date)).strftime('%Y-%m-%d')

Notice the int(in_date) part?

It wants a Unix timestamp. There are several ways to get one: the time module, the calendar module, or Arrow.

import datetime
import calendar

date = datetime.datetime.strptime("1993-06-30", "%Y-%m-%d")
start_date = calendar.timegm(date.utctimetuple())
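
As a minimal sketch of the conversion in both directions, assuming dates are read as midnight UTC: the helper names to_unix_timestamp and from_unix_timestamp are hypothetical and are not part of yahoofinancials. The reverse direction adds a timedelta to the epoch rather than calling fromtimestamp, because fromtimestamp can raise this same ValueError for negative (pre-1970) values on some platforms, notably Windows.

```python
import calendar
import datetime

# Reference point for Unix time, as a naive datetime interpreted as UTC.
EPOCH = datetime.datetime(1970, 1, 1)


def to_unix_timestamp(date_string):
    """Convert 'YYYY-MM-DD' to seconds since the epoch, read as midnight UTC."""
    parsed = datetime.datetime.strptime(date_string, "%Y-%m-%d")
    return calendar.timegm(parsed.utctimetuple())


def from_unix_timestamp(seconds):
    """Convert seconds since the epoch to 'YYYY-MM-DD' (UTC), negatives included."""
    moment = EPOCH + datetime.timedelta(seconds=int(seconds))
    return moment.strftime("%Y-%m-%d")


start_ts = to_unix_timestamp("1993-06-30")
print(start_ts)
print(from_unix_timestamp(start_ts))
print(from_unix_timestamp(-86400))  # one day before the epoch
```

Note that calendar.timegm treats the date as UTC, while time.mktime treats it as local time, so the two can differ by the local UTC offset.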

*Update* OK, so I fixed everything up to the dataframes portion. Here is my current code:

# Select Tickers and stock history dates
index = {'DJI': YahooFinancials('^DJI'),
         'GSPC': YahooFinancials('^GSPC'),
         'IXIC': YahooFinancials('^IXIC'),
         'GDAXI': YahooFinancials('^GDAXI'),
         'FTSE': YahooFinancials('^FTSE'),
         'HSI': YahooFinancials('^HSI'),
         'KS11': YahooFinancials('^KS11'),
         'NSEI': YahooFinancials('^NSEI')}
freq = 'daily'
start_date = '1993-06-30'
end_date = '2018-06-30'

# Clean returned stock history data and remove dividend events from price history
daily = {}
for k in index:
    tmp = index[k].get_historical_stock_data(start_date, end_date, freq)
    if tmp:
        daily[k] = tmp['^{}'.format(k)]['prices'] if 'prices' in tmp['^{}'.format(k)] else []

Unfortunately, I had to fix a couple of things in the yahoofinancials module. For the YahooFinanceETL class:

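@staticmethod
def format_date(in_date, convert_type):
    # Detect the direction from the input itself: anything int() accepts is
    # treated as a Unix timestamp, anything else as a 'YYYY-MM-DD' string.
    try:
        x = int(in_date)
        convert_type = 'standard'
    except (TypeError, ValueError):
        convert_type = 'unixstamp'
    if convert_type == 'standard':
        if x < 0:
            # fromtimestamp() can raise ValueError for negative (pre-1970)
            # timestamps on some platforms, so offset from the epoch instead.
            form_date = (datetime.datetime(1970, 1, 1)
                         + datetime.timedelta(seconds=x)).strftime('%Y-%m-%d')
        else:
            form_date = datetime.datetime.fromtimestamp(x).strftime('%Y-%m-%d')
    else:
        split_date = in_date.split('-')
        d = date(int(split_date[0]), int(split_date[1]), int(split_date[2]))
        form_date = int(time.mktime(d.timetuple()))
    return form_date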

And:

    # private static method to scrape data from yahoo finance
    @staticmethod
    def _scrape_data(url, tech_type, statement_type):
        response = requests.get(url)
        soup = BeautifulSoup(response.content, "html.parser")
        script = soup.find("script", text=re.compile("root.App.main")).text
        data = loads(re.search(r"root.App.main\s+=\s+(\{.*\})", script).group(1))
        if tech_type == '' and statement_type != 'history':
            stores = data["context"]["dispatcher"]["stores"]["QuoteSummaryStore"]
        elif tech_type != '' and statement_type != 'history':
            stores = data["context"]["dispatcher"]["stores"]["QuoteSummaryStore"][tech_type]
        else:
            if "HistoricalPriceStore" in data["context"]["dispatcher"]["stores"] :
                stores = data["context"]["dispatcher"]["stores"]["HistoricalPriceStore"]
            else:
                stores = data["context"]["dispatcher"]["stores"]["QuoteSummaryStore"]
        return stores

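You will want to look at the daily dict and rewrite your build_data_frame function, which should be a lot simpler now that you are already working with a dictionary.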
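
A rough sketch of what that simpler version could look like, assuming each entry in daily is a list of records with formatted_date and close keys, as in the question's code. The function name align_closes and the sample records are made up for illustration. It aligns by date rather than by list position, so indexes with different trading calendars do not shift against each other.

```python
def align_closes(daily):
    """Return {date: {ticker: close}} built from {ticker: [price records]}."""
    rows = {}
    for ticker, records in daily.items():
        for rec in records:
            # Skip event records (e.g. dividends), which the question's
            # code filters out by the presence of a 'type' key.
            if 'type' in rec:
                continue
            rows.setdefault(rec['formatted_date'], {})[ticker] = rec['close']
    return rows


# Made-up sample data, not real quotes.
sample = {
    'DJI': [{'formatted_date': '2018-06-28', 'close': 1.0},
            {'formatted_date': '2018-06-29', 'close': 2.0}],
    'HSI': [{'formatted_date': '2018-06-29', 'close': 3.0}],
}
rows = align_closes(sample)
for day in sorted(rows):
    print(day, rows[day])
```

The result can be passed to pd.DataFrame.from_dict(rows, orient='index').sort_index(); dates on which a given market has no record then appear as NaN in that column.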

Answer 2

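I am actually the maintainer and author of YahooFinancials. I just saw this post and wanted to apologize for the inconvenience and let you all know that I will be working on fixing the module this evening.

Could you please open an issue on the module's GitHub page detailing this? It would also be very helpful to know which version of Python you were running when you hit these problems. https://github.com/JECSand/yahoofinancials/issues

I am at work right now, but as soon as I get home in about 7 hours I will try to code a fix and release it. I will also work on the exception handling. I try my best to maintain this module, but my day (and often night) job is rather demanding. I will report back with the final results of these fixes and publish to PyPI once it is done and stable.

If anyone else has feedback or fixes of their own to offer, it would be a huge help in resolving this. Proper credit will of course be given. I am also in real need of contributors, so if anyone is interested in that, please let me know. I really want to take YahooFinancials to the next level and make this project a stable and reliable source of free financial data for Python projects.

Thank you for your patience and for using YahooFinancials.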

ValueError: timestamp out of range for platform localtime()/gmtime() function
