I am trying to create a DataFrame that contains columns from two different DataFrames.
import pandas as pd
import numpy as np
from statsmodels import api as sm
import pandas_datareader.data as web
import datetime
start = datetime.datetime(2016,12,2)
end = datetime.datetime.today()
df = web.get_data_yahoo(['F'], start, end)
df1 = web.get_data_yahoo(['^GSPC'], start, end)
df3 = pd.concat([df['Adj Close'], df1['Adj Close']])
From this, I expected df3 to have two columns, each holding the [Adj Close] data. Instead, I get:
                    F        ^GSPC
Date
2016-12-01  10.297861          NaN
2016-12-02  10.140451          NaN
2016-12-05  10.306145          NaN
2016-12-06  10.405562          NaN
2016-12-07  10.819797          NaN
...               ...          ...
2019-11-22        NaN  3110.290039
2019-11-25        NaN  3133.639893
2019-11-26        NaN  3140.520020
2019-11-27        NaN  3153.629883
2019-11-29        NaN  3140.979980

1508 rows × 2 columns
What can I do to get rid of the NaN values, and why are they there?
Answer 0 (score: 1)
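By default, concat stacks the frames vertically (axis=0), so every row has a value in only one of the two columns and NaN in the other. Add the parameter axis=1 to concat so the frames are joined along the columns: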
df3 = pd.concat([df['Adj Close'], df1['Adj Close']], axis=1)
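To see the difference without downloading anything, here is a minimal sketch using made-up prices (the values and the three-day date range are assumptions for illustration, not real Yahoo data). It compares the default axis=0 with axis=1:

```python
import pandas as pd

# Made-up prices for illustration only (not real Yahoo data).
dates = pd.date_range("2016-12-01", periods=3, freq="D")
f = pd.DataFrame({"F": [10.3, 10.1, 10.3]}, index=dates)
gspc = pd.DataFrame({"^GSPC": [2191.1, 2191.9, 2204.7]}, index=dates)

# Default axis=0: rows are stacked and the columns are unioned,
# so each row is missing the other frame's column (NaN).
stacked = pd.concat([f, gspc])
print(stacked)

# axis=1: frames are placed side by side and aligned on the date index.
side_by_side = pd.concat([f, gspc], axis=1)
print(side_by_side)
```

The stacked result has twice as many rows as either input, which matches the 1508 rows in the question. If the two indexes do not share every date, axis=1 can still leave some NaN; passing join='inner' to concat keeps only the dates present in both.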
However, I think your solution can be simplified by passing a list of symbols to get_data_yahoo:
df3 = web.get_data_yahoo(['F', '^GSPC'], start, end)['Adj Close']