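I want to merge columns generated in a for loop into a main DataFrame. The problem is that the generated column and the main df have different datetimes. This should be simple, but I am running into problems. The main df holds datetime data spanning many days at a 1-hour frequency. I have a for loop that generates new data at a 1 s frequency, and I want to attach this data to a new column in the main df. I am getting errors.
My code: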
main_df =
                     col_A
2019-10-01 09:19:40     10
2019-10-02 09:20:15     20
main_df['new_col'] = ""
for i in main_df.index.date.unique():
    aux_df = day_fun(i)  # day_fun generates 1s frequency data under new_col
    # aux_df index is datetime, column is new_col
    main_df['new_col'].loc[i] = pd.concat([main_df, aux_df], axis=1).reindex(main_df.index)
    df['new_col'].loc[i] = aux_df['new_col']
# For example, aux_df will hold the following data.
# For i = '2019-10-01':
#                      new_col
# 2019-10-01 09:19:40      100
# 2019-10-01 09:21:29      200
# For i = '2019-10-02':
#                      new_col
# 2019-10-02 09:20:15      300
# 2019-10-02 09:22:39      400
Current output:
raise ValueError("Incompatible indexer with DataFrame")
ValueError: Incompatible indexer with DataFrame
Expected output:
main_df =
                     col_A  new_col
2019-10-01 09:19:40     10      100
2019-10-02 09:20:15     20      300
Answer 0: (score: 2)
Use merge_asof. Just in case, remember to coerce the dates to datetime.
#main_df['date']=pd.to_datetime(main_df['date'])
main_df=main_df.set_index('date').drop(columns=['d','t'])
print(main_df)
                     col_A
date
2019-10-01 09:19:40     10
2019-10-02 09:20:15     20
#aux_df['date']=pd.to_datetime(aux_df['date'])
aux_df=aux_df.set_index('date').drop(columns=['d','t'])
print(aux_df)
                     new_col
date
2019-10-01 09:19:40      100
2019-10-01 09:21:29      200
2019-10-02 09:20:15      300
2019-10-02 09:22:39      400
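To reproduce the two frames printed above, something like the following sketch may help. It assumes the raw data arrives with the date and time in separate string columns named d and t, which is only a guess based on the drop(columns=['d','t']) calls; the helper with_datetime_index is a hypothetical name, not part of pandas.

```python
import pandas as pd

# Assumption: raw frames carry the date and the time as separate string
# columns 'd' and 't' (inferred from the drop(columns=['d', 't']) calls).
main_raw = pd.DataFrame({
    "d": ["2019-10-01", "2019-10-02"],
    "t": ["09:19:40", "09:20:15"],
    "col_A": [10, 20],
})
aux_raw = pd.DataFrame({
    "d": ["2019-10-01", "2019-10-01", "2019-10-02", "2019-10-02"],
    "t": ["09:19:40", "09:21:29", "09:20:15", "09:22:39"],
    "new_col": [100, 200, 300, 400],
})

def with_datetime_index(raw):
    """Hypothetical helper: build a 'date' column, coerce it, and index by it."""
    out = raw.copy()
    # Coerce the combined strings to real datetimes so merge_asof can compare them
    out["date"] = pd.to_datetime(out["d"] + " " + out["t"])
    return out.set_index("date").drop(columns=["d", "t"])

main_df = with_datetime_index(main_raw)
aux_df = with_datetime_index(aux_raw)
print(main_df)
print(aux_df)
```

Both frames then have a DatetimeIndex named date, which is what merge_asof needs when matching on the index.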
Solution
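# Match one day of artificial data at a time to the rows of main_df
parts = []
for i in main_df.index.strftime('%Y-%m-%d').unique():
    aux_df = day_fun(i)  # generate artificial data for the i-th day
    # Pair each main_df row of that day with the latest aux_df row at or before it
    parts.append(pd.merge_asof(main_df.loc[i], aux_df, left_index=True, right_index=True))
main_df = pd.concat(parts)
print(main_df)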
                     col_A  new_col
date
2019-10-01 09:19:40     10      100
2019-10-02 09:20:15     20      300
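For anyone who wants to run this end to end, here is a minimal self-contained sketch. The day_fun below is a hypothetical stand-in that just returns the sample rows from the question, since the real function is not shown. Instead of merging day by day, it concatenates all the generated data and calls merge_asof once; the tolerance argument is an optional addition, not part of the answer above, that keeps a row from picking up a value more than one second old.

```python
import pandas as pd

main_df = pd.DataFrame(
    {"col_A": [10, 20]},
    index=pd.to_datetime(["2019-10-01 09:19:40", "2019-10-02 09:20:15"]),
)
main_df.index.name = "date"

def day_fun(day):
    """Hypothetical stand-in for the question's day_fun (sample rows only)."""
    samples = {
        "2019-10-01": (["2019-10-01 09:19:40", "2019-10-01 09:21:29"], [100, 200]),
        "2019-10-02": (["2019-10-02 09:20:15", "2019-10-02 09:22:39"], [300, 400]),
    }
    stamps, values = samples[day]
    return pd.DataFrame({"new_col": values},
                        index=pd.DatetimeIndex(stamps, name="date"))

# Generate the data for every day present in main_df, then sort by time,
# because merge_asof requires both sides to be sorted on the key.
days = main_df.index.strftime("%Y-%m-%d").unique()
aux_all = pd.concat([day_fun(day) for day in days]).sort_index()

result = pd.merge_asof(
    main_df, aux_all,
    left_index=True, right_index=True,
    direction="backward",          # default: latest aux row at or before each main row
    tolerance=pd.Timedelta("1s"),  # assumption: ignore matches older than one second
)
print(result)
```

With the sample data every timestamp in main_df has an exact match in the generated data, so new_col comes out as 100 and 300. If no row fell within the tolerance, merge_asof would leave NaN in new_col for that timestamp.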