Python: merging new columns created in a for loop

Time: 2020-09-30 02:36:40

Tags: python pandas dataframe

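I want to merge columns generated in a for loop into the main DataFrame. The problem is that the generated column and the main df have different datetimes. This should be simple, but I have run into some problems. The main df holds datetime data spanning many days at a 1-hour frequency. I have a for loop that generates new data at a 1-second frequency, and I want to add that data to a new column in the main df. Doing so raises an error.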

My code:

main_df = 
                       col_A
2019-10-01 09:19:40    10
2019-10-02 09:20:15    20
main_df['new_col'] = ""
for i in main_df.index.date.unique():
   aux_df = day_fun(i) # day_fun generates 1s frequency data under new_col
   # aux_df index is datetime, column is new_col
   main_df['new_col'].loc[i] = pd.concat([main_df, aux_df], axis=1).reindex(main_df.index)
   df['new_col'].loc[i] = aux_df['new_col']

# For example, aux_df will contain the following data.
# For i = '2019-10-01':
#                      new_col
# 2019-10-01 09:19:40    100
# 2019-10-01 09:21:29    200
# For i = '2019-10-02':
#                      new_col
# 2019-10-02 09:20:15    300
# 2019-10-02 09:22:39    400

Current output:

 raise ValueError("Incompatible indexer with DataFrame")

ValueError: Incompatible indexer with DataFrame

Expected output:

main_df = 
                       col_A   new_col
2019-10-01 09:19:40    10      100
2019-10-02 09:20:15    20      300

1 answer:

Answer 0: (score: 2)

Use pd.merge_asof. Just in case, remember to coerce the dates to datetime first.
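As a rough illustration of that advice, the sketch below rebuilds the two frames from the question with the timestamps starting out as plain strings, coerces them with pd.to_datetime, and then calls pd.merge_asof on the sorted indexes. The sample frames and the name merged are assumptions made for this example and are not part of the original answer.

```python
import pandas as pd

# Hypothetical sample data mirroring the frames shown in the question.
main_df = pd.DataFrame(
    {"date": ["2019-10-01 09:19:40", "2019-10-02 09:20:15"], "col_A": [10, 20]}
)
aux_df = pd.DataFrame(
    {
        "date": [
            "2019-10-01 09:19:40",
            "2019-10-01 09:21:29",
            "2019-10-02 09:20:15",
            "2019-10-02 09:22:39",
        ],
        "new_col": [100, 200, 300, 400],
    }
)

# Coerce the string column to datetime, then use it as a sorted index.
# merge_asof requires both keys to be sorted.
for frame in (main_df, aux_df):
    frame["date"] = pd.to_datetime(frame["date"])
main_df = main_df.set_index("date").sort_index()
aux_df = aux_df.set_index("date").sort_index()

# For each row of main_df, pick the last aux_df row at or before its timestamp
# (the default direction is "backward" and exact matches are allowed).
merged = pd.merge_asof(main_df, aux_df, left_index=True, right_index=True)
print(merged)
```

If the keys were left as strings, the match would not be a time-based one, which is why the coercion step comes first.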

# main_df['date'] = pd.to_datetime(main_df['date'])
# Use the datetime column as the index and drop the columns 'd' and 't'
main_df = main_df.set_index('date').drop(columns=['d', 't'])
print(main_df)

                      col_A
date                      
2019-10-01 09:19:40     10
2019-10-02 09:20:15     20

# aux_df['date'] = pd.to_datetime(aux_df['date'])
# Use the datetime column as the index and drop the columns 'd' and 't'
aux_df = aux_df.set_index('date').drop(columns=['d', 't'])
print(aux_df)

                      new_col
date                        
2019-10-01 09:19:40      100
2019-10-01 09:21:29      200
2019-10-02 09:20:15      300
2019-10-02 09:22:39      400

Solution

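# Merge one day of artificial data at a time, then stitch the days together
parts = []
for i in main_df.index.strftime('%Y-%m-%d').unique():
    aux_df = day_fun(i)  # generate artificial data for the i-th day
    # Assigning the result back with main_df.loc[i] = ... cannot add new_col,
    # so collect each day's merged rows and concatenate them afterwards
    parts.append(pd.merge_asof(main_df.loc[i], aux_df, left_index=True, right_index=True))
main_df = pd.concat(parts)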

     

                      col_A  new_col
date
2019-10-01 09:19:40     10      100
2019-10-02 09:20:15     20      300
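For anyone who wants to try the per-day loop end to end, here is a minimal self-contained sketch. The day_fun below is only a hypothetical stand-in that looks up rows from a small hard-coded frame (the real function from the question is not shown), and parts and result are names chosen for this example.

```python
import pandas as pd

# Hypothetical lookup table standing in for the data that the question's
# day_fun generates; the values are taken from the question's comments.
_AUX = pd.DataFrame(
    {"new_col": [100, 200, 300, 400]},
    index=pd.to_datetime(
        [
            "2019-10-01 09:19:40",
            "2019-10-01 09:21:29",
            "2019-10-02 09:20:15",
            "2019-10-02 09:22:39",
        ]
    ),
)


def day_fun(day: str) -> pd.DataFrame:
    """Hypothetical stand-in: return the auxiliary rows for one calendar day."""
    return _AUX.loc[day]


main_df = pd.DataFrame(
    {"col_A": [10, 20]},
    index=pd.to_datetime(["2019-10-01 09:19:40", "2019-10-02 09:20:15"]),
)

parts = []
for day in main_df.index.strftime("%Y-%m-%d").unique():
    aux_df = day_fun(day)
    # main_df.loc[day] selects every row of that day via partial string indexing.
    parts.append(
        pd.merge_asof(main_df.loc[day], aux_df, left_index=True, right_index=True)
    )

# Stitch the per-day results back into one frame.
result = pd.concat(parts)
print(result)
```

By default merge_asof matches each left row to the most recent right row at or before it; its direction and tolerance parameters can tighten or change that rule if a stale match would be misleading.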