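I am working on a stock prediction project. I want to download historical data from Yahoo Finance and save it in CSV format.
I am a beginner in Python, so I have not been able to fix the error myself.
My code is as follows: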
import re
import urllib2
import calendar
import datetime
import getopt
import sys
import time
crumble_link = 'https://finance.yahoo.com/quote/{0}/history?p={0}'
crumble_regex = r'CrumbStore":{"crumb":"(.*?)"}'
cookie_regex = r'Set-Cookie: (.*?); '
quote_link = 'https://query1.finance.yahoo.com/v7/finance/download/{}?period1={}&period2={}&interval=1d&events=history&crumb={}'
def get_crumble_and_cookie(symbol):
    link = crumble_link.format(symbol)
    response = urllib2.urlopen(link)
    match = re.search(cookie_regex, str(response.info()))
    cookie_str = match.group(1)
    text = response.read()
    match = re.search(crumble_regex, text)
    crumble_str = match.group(1)
    return crumble_str, cookie_str
def download_quote(symbol, date_from, date_to):
    time_stamp_from = calendar.timegm(datetime.datetime.strptime(date_from, "%Y-%m-%d").timetuple())
    time_stamp_to = calendar.timegm(datetime.datetime.strptime(date_to, "%Y-%m-%d").timetuple())
    attempts = 0
    while attempts < 5:
        crumble_str, cookie_str = get_crumble_and_cookie(symbol)
        link = quote_link.format(symbol, time_stamp_from, time_stamp_to, crumble_str)
        #print link
        r = urllib2.Request(link, headers={'Cookie': cookie_str})
        try:
            response = urllib2.urlopen(r)
            text = response.read()
            print "{} downloaded".format(symbol)
            return text
        except urllib2.URLError:
            print "{} failed at attempt # {}".format(symbol, attempts)
            attempts += 1
            time.sleep(2*attempts)
    return ""
if __name__ == '__main__':
    print get_crumble_and_cookie('KO')
    from_arg = "from"
    to_arg = "to"
    symbol_arg = "symbol"
    output_arg = "o"
    opt_list = (from_arg+"=", to_arg+"=", symbol_arg+"=")
    try:
        options, args = getopt.getopt(sys.argv[1:],output_arg+":",opt_list)
    except getopt.GetoptError as err:
        print err
    for opt, value in options:
        if opt[2:] == from_arg:
            from_val = value
        elif opt[2:] == to_arg:
            to_val = value
        elif opt[2:] == symbol_arg:
            symbol_val = value
        elif opt[1:] == output_arg:
            output_val = value
    print "downloading {}".format(symbol_val)
    text = download_quote(symbol_val, from_val, to_val)
    with open(output_val, 'wb') as f:
        f.write(text)
    print "{} written to {}".format(symbol_val, output_val)
The error message I get is:
  File "C:/Users/Murali/PycharmProjects/generate/venv/tcl/generate2.py", line 49, in <module>
    print get_crumble_and_cookie('KO')
  File "C:/Users/Murali/PycharmProjects/generate/venv/tcl/generate2.py", line 19, in get_crumble_and_cookie
    cookie_str = match.group(1)
AttributeError: 'NoneType' object has no attribute 'group'
How can I fix this error, which appeared suddenly?
Answer 0 (score: 0)
Take a look at these two commands:
match = re.search(cookie_regex, str(response.info()))
cookie_str = match.group(1)
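The first one runs a regular-expression search over the string response.info(), looking for a match for cookie_regex. match.group(1) is then supposed to pull the captured text out of that match. The problem is that if you put a print match between these two commands, you will see that re.search() returned None. That means match.group() has nothing to "group", which is why it raises the error.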
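A minimal sketch of how the script could guard against this case, assuming the headers are available as a single string. The helper name extract_cookie and the sample header text are hypothetical, made up for illustration; only re.search() and match.group() come from the original script.

```python
import re

cookie_regex = r'Set-Cookie: (.*?); '


def extract_cookie(header_text, pattern=cookie_regex):
    """Return the first captured cookie, or None if the pattern is absent."""
    match = re.search(pattern, header_text)
    if match is None:
        # re.search() returns None when nothing matches, so calling
        # match.group(1) here would raise AttributeError.
        return None
    return match.group(1)


# Hypothetical header text, only to show both outcomes.
matching_headers = "Set-Cookie: B=abc123; expires=Thu, 01-Jan-2026\r\n"
lowercase_headers = "set-cookie: B=abc123; expires=Thu, 01-Jan-2026\r\n"

found = extract_cookie(matching_headers)     # the pattern matches
missing = extract_cookie(lowercase_headers)  # no match, so None
```

With a check like this the caller can report a clear message (or retry) instead of crashing with an AttributeError.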
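If you look closely at response.info() (you can add a print response.info() command to your script to see it), you will find that the response contains a line starting with "set-cookie:", followed by the value you are trying to capture. However, you have set the cookie_regex string to look for a line with "Set-Cookie:". Note the capital S and C. When I changed that string to lowercase, the error went away: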
cookie_regex = r'set-cookie: (.*?); '
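HTTP header names are case-insensitive, so the casing you see may vary. As a hedged alternative to hard-coding lowercase, the pattern could be compiled with re.IGNORECASE so that either spelling matches. The sample header lines below are made up for illustration and are not real Yahoo responses.

```python
import re

# Same pattern as in the question, but matched without regard to case.
cookie_regex = re.compile(r'set-cookie: (.*?); ', re.IGNORECASE)

# Hypothetical header lines in both casings.
samples = [
    "Set-Cookie: B=abc123; path=/",
    "set-cookie: B=abc123; path=/",
]

# Both lines match, so .group(1) is safe for these samples.
cookies = [cookie_regex.search(s).group(1) for s in samples]
```

In the original script this would mean calling cookie_regex.search(str(response.info())) instead of re.search(cookie_regex, ...).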
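After that, I did run into another error: the script stopped at print "downloading {}".format(symbol_val) because symbol_val was not defined. It looks like this variable is only declared and assigned when opt[2:] == symbol_arg:. So you may want to rewrite that part to cover all cases.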
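One possible way to cover all cases, sketched under the assumption that all four options are required: start every value as None, fill in whatever getopt finds, and fail with a clear message if anything is still missing. The function name parse_args and the sample command line are hypothetical; the option names are the ones from the question.

```python
import getopt


def parse_args(argv):
    """Parse --from, --to, --symbol and -o; raise ValueError if any is missing."""
    # Start every value as None so that each name is always defined.
    values = {"from": None, "to": None, "symbol": None, "o": None}
    # Unknown options still raise getopt.GetoptError, as in the original.
    options, _ = getopt.getopt(argv, "o:", ["from=", "to=", "symbol="])
    for opt, value in options:
        # "--from" -> "from", "-o" -> "o"
        values[opt.lstrip("-")] = value
    missing = sorted(name for name, value in values.items() if value is None)
    if missing:
        raise ValueError("missing options: " + ", ".join(missing))
    return values


# Hypothetical command line, for illustration only.
args = parse_args(["--symbol", "KO", "--from", "2017-01-01",
                   "--to", "2017-12-31", "-o", "KO.csv"])
```

In the script itself you would call parse_args(sys.argv[1:]) and then pass args["symbol"], args["from"] and args["to"] to download_quote.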