Apply new columns to a second DataFrame given multiple conditions

Time: 2019-05-09 16:23:56

Tags: python pandas apply

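I have two DataFrames, master_source and main_df. I want to add start_date and end_date from main_df to master_source, because this will eventually let me set a matching index on both DataFrames for a merge.

My initial logic is to check 1) whether market matches in the two DataFrames, and 2) whether viewed_date in master_source falls between start_date and end_date in main_df. If all conditions are met, I want to add start_date and end_date from main_df to master_source.

Note that viewed_date, start_date and end_date have all been converted to datetime objects.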

Here is sample input for each DataFrame:

master_source

viewed_date  market
2019-04-15   Abilene, TX
2019-04-11   Yuma, AZ
2019-04-19   Abilene, TX

main_df

market       start_date   end_date
Abilene, TX  2019-04-11   2019-04-17
Yuma, AZ     2019-04-11   2019-04-17
Abilene, TX  2019-04-18   2019-04-26

My code:

def add_dates(row):
    matches = main_df[
        (main_df['market'] == row['market']) &
        (row['viewed_date'].between(main_df['start_date'], main_df['end_date']))]
    start = matches['start_date'].values[0] if len(matches) > 0 else None
    end = matches['end_date'].values[0] if len(matches) > 0 else None
    row.loc['start_end', 'end_date'] = start, end
    return row

master_source = master_source.apply(add_dates, axis=1)

So far, my known problems are that this code raises an error and that I don't think I am adding the two new columns correctly (as opposed to a single new column).

1 answer:

Answer 0 (score: 1)

Handle the start date and the end date separately:

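def add_start_dates(market, viewed):
    # Rows of main_df for the same market.
    matches = main_df[main_df['market'] == market]

    # Of those, keep the ranges that contain the viewed date (inclusive).
    matches2 = matches[(matches['start_date'] <= viewed) &
                       (matches['end_date'] >= viewed)]
    if len(matches2) > 0:
        return matches2['start_date'].iloc[0]
    else:
        # No matching range: fall back to the viewed date itself.
        return viewed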

Do the same for the end date.
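As a rough sketch of what the end-date counterpart could look like: the name add_end_dates and the sample frames below are my own illustration modelled on the question's data, not something from the original post. It mirrors add_start_dates, including the fallback to the viewed date when no range matches.

```python
import pandas as pd

# Sample frames modelled on the question's input (illustrative values).
main_df = pd.DataFrame({
    'market': ['Abilene, TX', 'Yuma, AZ', 'Abilene, TX'],
    'start_date': pd.to_datetime(['2019-04-11', '2019-04-11', '2019-04-18']),
    'end_date': pd.to_datetime(['2019-04-17', '2019-04-17', '2019-04-26']),
})
master_source = pd.DataFrame({
    'viewed_date': pd.to_datetime(['2019-04-15', '2019-04-11', '2019-04-19']),
    'market': ['Abilene, TX', 'Yuma, AZ', 'Abilene, TX'],
})


def add_end_dates(market, viewed):
    # Hypothetical mirror of add_start_dates: same filters, other column.
    matches = main_df[main_df['market'] == market]
    matches2 = matches[(matches['start_date'] <= viewed) &
                       (matches['end_date'] >= viewed)]
    if len(matches2) > 0:
        return matches2['end_date'].iloc[0]
    # Same fallback as add_start_dates: the viewed date itself.
    return viewed


master_source['end_date'] = [
    add_end_dates(m, v)
    for m, v in zip(master_source['market'], master_source['viewed_date'])
]
```

With both helper functions in place, each new column is filled by its own list comprehension, which avoids trying to assign two values to a row at once.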

print(master_source)
print()
print(main_df)
print()
master_source['start_date'] = [add_start_dates(m, v)
                               for m, v in zip(master_source['market'],
                                               master_source['viewed_date'])]
print(master_source)

Output:

    market viewed_date
0  abilene  2019-04-15
1     yuma  2019-04-11
2  abilene  2019-04-19

    end_date   market start_date
0 2019-04-17  abilene 2019-04-11
1 2019-04-17     yuma 2019-04-11
2 2019-04-26  abilene 2019-04-18


    market viewed_date start_date
0  abilene  2019-04-15 2019-04-11
1     yuma  2019-04-11 2019-04-11
2  abilene  2019-04-19 2019-04-18
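If the row-by-row lookup gets slow on larger frames, one alternative (a different technique from the answer above, sketched here under the assumption that the data looks like the question's sample) is to merge on market first and then filter by the date range. The variable names candidates, matched and result are illustrative. Note that, unlike the function above, rows with no matching range end up with NaT rather than the viewed date.

```python
import pandas as pd

# Sample frames modelled on the question's input (illustrative values).
main_df = pd.DataFrame({
    'market': ['Abilene, TX', 'Yuma, AZ', 'Abilene, TX'],
    'start_date': pd.to_datetime(['2019-04-11', '2019-04-11', '2019-04-18']),
    'end_date': pd.to_datetime(['2019-04-17', '2019-04-17', '2019-04-26']),
})
master_source = pd.DataFrame({
    'viewed_date': pd.to_datetime(['2019-04-15', '2019-04-11', '2019-04-19']),
    'market': ['Abilene, TX', 'Yuma, AZ', 'Abilene, TX'],
})

# Pair every viewed row with every date range of the same market.
# reset_index() keeps the original row label in a column named 'index'.
candidates = master_source.reset_index().merge(main_df, on='market', how='left')

# Keep only the pairs where viewed_date lies inside [start_date, end_date].
in_range = ((candidates['start_date'] <= candidates['viewed_date']) &
            (candidates['viewed_date'] <= candidates['end_date']))

# If several ranges match one row, keep the first, as .iloc[0] does above.
matched = candidates[in_range].drop_duplicates('index').set_index('index')

# Align the two new columns back onto master_source by row label.
result = master_source.join(matched[['start_date', 'end_date']])
```

This adds both columns in one pass, at the cost of a temporarily larger intermediate frame when a market has many date ranges.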