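Checking HTTP status (Python)

Time: 2015-06-24 19:31:39

Tags: python

Is there a way to check the HTTP status code in the code below? I am not using the requests or urllib libraries, which would allow this.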

import pandas as pd  # needed for pd.concat further down
from pandas.io.excel import read_excel

url = 'http://www.bankofengland.co.uk/statistics/Documents/yieldcurve/uknom05_mdaily.xls'

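# check the sheet number: spot curve is sheet 9 of 9, short end is sheet 7 of 9
# (zero-based indices; the keyword is sheet_name in pandas >= 0.21, older versions used sheetname)
spot_curve = read_excel(url, sheet_name=8)  # creates the dataframes
short_end_spot_curve = read_excel(url, sheet_name=6)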

# do some cleaning, keep NaN for now, as forward fill NaN is not recommended for yield curve
spot_curve.columns = spot_curve.loc['years:']
valid_index = spot_curve.index[4:]
spot_curve = spot_curve.loc[valid_index]
# remove all maturities within 5 years as those are duplicated in short-end file
col_mask = spot_curve.columns.values > 5
spot_curve = spot_curve.iloc[:, col_mask]
#Providing correct names
short_end_spot_curve.columns = short_end_spot_curve.loc['years:']
valid_index = short_end_spot_curve.index[4:]
short_end_spot_curve = short_end_spot_curve.loc[valid_index]

# merge these two, time index are identical
# ==============================================
combined_data = pd.concat([short_end_spot_curve, spot_curve], axis=1, join='outer')
# sort the maturity from short end to long end
combined_data.sort_index(axis=1, inplace=True)

def filter_func(group):
    return group.isnull().sum(axis=1) <= 50

combined_data = combined_data.groupby(level=0).filter(filter_func)

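1 answer:

Answer 0 (score: 0)

In pandas, read_excel tries to open the URL with urllib2.urlopen (urllib.request.urlopen on Python 3.x) and immediately calls .read() on the response, without keeping the HTTP response object, like this: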

data = urlopen(url).read()

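Although you only need part of the Excel file, pandas downloads the whole file every time. So I upvoted @jonnybazookatone.

It is better to store the Excel file locally. You can then check the status code and the file's md5 first, to verify the integrity of the data or for any other check.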
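The suggestion above can be sketched roughly as follows, assuming Python 3 and only the standard library. The function name `fetch_with_status` and its `opener` parameter are hypothetical names introduced for illustration; they are not part of pandas or of the original code. The idea is to make the HTTP request yourself, so that the status code stays available, and to compute an md5 digest of the downloaded bytes.

```python
import hashlib
from urllib.error import HTTPError
from urllib.request import urlopen


def fetch_with_status(url, opener=urlopen):
    """Download url once and return (status_code, body_bytes, md5_hex).

    opener is a hypothetical injection point (defaults to urlopen) so the
    function can be exercised without network access.
    """
    try:
        with opener(url) as response:
            status = response.getcode()  # e.g. 200 on success
            body = response.read()       # the whole file, downloaded once
    except HTTPError as err:
        # urlopen raises HTTPError for 4xx/5xx responses; the code is on err.code
        return err.code, b"", None
    return status, body, hashlib.md5(body).hexdigest()
```

If the status is 200, the bytes could be written to a local file or wrapped in `io.BytesIO` and passed to `read_excel` for both sheets, so the workbook is downloaded once rather than once per call. The md5 digest is only useful for comparison against a previously stored or published value; this sketch does not assume such a value exists for this file.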