I'm new to Python and I absolutely love it! Unfortunately, my limited knowledge keeps me hitting a wall with a piece of code from the tutorial I'm following (see the link below). The code is supposed to:

1) Scrape the ticker list for all S&P 500 companies from Wikipedia using bs4 (done).

2) Use pandas_datareader to pull data from Yahoo for every ticker and write each company's OHLC data to its own csv file in a folder called stock_dfs (done). The catch: Yahoo blocks me after roughly 70 requests, so any suggestion here would be great! I tried importing time and using time.sleep to create a 5-second delay, but no matter where I place it in the loop, Yahoo still locks me out (a rough sketch of my attempt is below, after this list).

3) Combine all the ticker data into one master file ready for analysis. This is where I'm stuck: I just can't get them to combine. I even tried creating the csv manually, but still nothing.

Note: in the code on the website he requests morningstar data, whereas in the video he uses yahoo. I think that was a mistake. Either way, it runs for him on 3.5, so I assumed it was a version issue. Thanks!

Below you'll find the error message from running this, followed by the code.
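For reference, here is roughly where I put the delay (a sketch of my attempt, not the tutorial's code; it reuses tickers, web, start and end from get_data_from_yahoo further down, and I tried the sleep call in several spots in the loop):

import time

for ticker in tickers:
    if not os.path.exists('stock_dfs/{}.csv'.format(ticker)):
        df = web.DataReader(ticker, 'yahoo', start, end)
        df.to_csv('stock_dfs/{}.csv'.format(ticker))
        time.sleep(5)  # 5-second pause between requests; Yahoo still cut me off

And this is the one line that differs between the website and the video versions, as far as I can tell:

df = web.DataReader(ticker, 'morningstar', start, end)  # website version
df = web.DataReader(ticker, 'yahoo', start, end)        # video version (what I used)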
Traceback (most recent call last):
File "C:/Users/harry/PycharmProjects/Tutorials/Finance with Python/SENTDEX_T7_sp500InOneDataframe.py", line 87, in <module>
compile_data()
File "C:/Users/harry/PycharmProjects/Tutorials/Finance with Python/SENTDEX_T7_sp500InOneDataframe.py", line 70, in compile_data
df = pd.read_csv('stock_dfs/{}.csv'.format(ticker))
File "C:\Users\harry\Anaconda3\lib\site-packages\pandas\io\parsers.py", line 702, in parser_f
return _read(filepath_or_buffer, kwds)
File "C:\Users\harry\Anaconda3\lib\site-packages\pandas\io\parsers.py", line 429, in _read
parser = TextFileReader(filepath_or_buffer, **kwds)
File "C:\Users\harry\Anaconda3\lib\site-packages\pandas\io\parsers.py", line 895, in __init__
self._make_engine(self.engine)
File "C:\Users\harry\Anaconda3\lib\site-packages\pandas\io\parsers.py", line 1122, in _make_engine
self._engine = CParserWrapper(self.f, **self.options)
File "C:\Users\harry\Anaconda3\lib\site-packages\pandas\io\parsers.py", line 1853, in __init__
self._reader = parsers.TextReader(src, **kwds)
File "pandas/_libs/parsers.pyx", line 387, in pandas._libs.parsers.TextReader.__cinit__
File "pandas/_libs/parsers.pyx", line 705, in pandas._libs.parsers.TextReader._setup_parser_source
FileNotFoundError: [Errno 2] File b'stock_dfs/BRK.B.csv' does not exist: b'stock_dfs/BRK.B.csv'
Process finished with exit code 1
import bs4 as bs
import datetime as dt
import os
import pandas as pd
import pandas_datareader.data as web
import pickle
import requests


def save_sp500_tickers():
    resp = requests.get('http://en.wikipedia.org/wiki/List_of_S%26P_500_companies')
    soup = bs.BeautifulSoup(resp.text, 'lxml')
    table = soup.find('table', {'class': 'wikitable sortable'})
    tickers = []
    for row in table.findAll('tr')[1:]:
        ticker = row.findAll('td')[0].text
        tickers.append(ticker)
    with open("sp500tickers.pickle", "wb") as f:
        pickle.dump(tickers, f)
    return tickers

# save_sp500_tickers()


def get_data_from_yahoo(reload_sp500=False):
    if reload_sp500:
        tickers = save_sp500_tickers()
    else:
        with open("sp500tickers.pickle", "rb") as f:
            tickers = pickle.load(f)
    if not os.path.exists('stock_dfs'):
        os.makedirs('stock_dfs')
    start = dt.datetime(2010, 1, 1)
    end = dt.datetime.now()
    for ticker in tickers:
        # just in case your connection breaks, we'd like to save our progress!
        if not os.path.exists('stock_dfs/{}.csv'.format(ticker)):
            df = web.DataReader(ticker, 'yahoo', start, end)
            df.reset_index(inplace=True)
            df.set_index("Date", inplace=True)
            df = df.drop("Symbol", axis=1)
            df.to_csv('stock_dfs/{}.csv'.format(ticker))
        else:
            print('Already have {}'.format(ticker))


def compile_data():
    with open("sp500tickers.pickle", "rb") as f:
        tickers = pickle.load(f)
    main_df = pd.DataFrame()
    for count, ticker in enumerate(tickers):
        df = pd.read_csv('stock_dfs/{}.csv'.format(ticker))
        df.set_index('Date', inplace=True)
        df.rename(columns={'Adj Close': ticker}, inplace=True)
        df.drop(['Open', 'High', 'Low', 'Close', 'Volume'], 1, inplace=True)
        if main_df.empty:
            main_df = df
        else:
            main_df = main_df.join(df, how='outer')
        if count % 10 == 0:
            print(count)
    print(main_df.head())
    main_df.to_csv('sp500_joined_closes.csv')


compile_data()