我有以下2个数据框:
day
date val
11740 2016-01-04 1.3970
11741 2016-01-05 1.3991
11742 2016-01-06 1.4084
11743 2016-01-07 1.4061
和
df
Adj_Close Close Date High Low
182 12927.200195 12927.200195 2016-01-04 12928.900391 12748.50
181 12920.099609 12920.099609 2016-01-05 12954.900391 12839.799805
180 12726.799805 12726.799805 2016-01-06 12854.599609 12701.700195
179 12448.200195 12448.200195 2016-01-07 12661.200195 12439.099609
我有一个俗气的循环,通过“加入”共同日期来约会日期以创建新的数据框(new_df)。
new_df = pd.DataFrame(columns=['date', 'close', 'fx', 'usd'])
for indexFx, rowFx in day.iterrows():
for indexSt, rowSt in df.iterrows(): #this is not efficient
fxDate = str(rowFx.date)[:10] #only keep data component not time
if str(rowSt['Date']) == fxDate:
dateObj = datetime.strptime(rowSt.Date,'%Y-%m-%d')
row = [dateObj, rowSt.Close,rowFx.val, float(rowSt.Close) * float(rowFx.val)]
new_df.loc[len(new_df)] = row
我知道有一种有效的方法来Python化这个循环。有人可以帮忙吗?
谢谢
答案 0 :(得分:1)
pd.concat([day.set_index('date'), df.set_index('Date')], axis=1)
>>>
val Adj_Close Close High \
2016-01-04 1.3970 12927.200195 12927.200195 12928.900391
2016-01-05 1.3991 12920.099609 12920.099609 12954.900391
2016-01-06 1.4084 12726.799805 12726.799805 12854.599609
2016-01-07 1.4061 12448.200195 12448.200195 12661.200195
Low
2016-01-04 12748.500000
2016-01-05 12839.799805
2016-01-06 12701.700195
2016-01-07 12439.099609
根据您是否需要内部或外部联接,您可以使用join='inner'
或join='outer'
指定该联接。