Pandas: update columns based on id

Date: 2021-01-13 02:34:23

Tags: python-3.x pandas dataframe series

df_2:

order_id   date        amount name   interval is_sent
123        2020-01-02  3      white  today    false
456        NaT         2      blue   weekly   false
789        2020-10-11  0      red    monthly  false
135        2020-6-01   3      orange weekly   false

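I am merging two dataframes and locating the rows where the date is later than in the previous result (or where a previously missing date is now set), to see which rows have changed: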

df_1['date'] = pd.to_datetime(df_1['date'])
df_2['date'] = pd.to_datetime(df_2['date'])
res = df_1.merge(df_2, on='order_id', suffixes=['_orig', ''])
m = res['date'].gt(res['date_orig']) | (res['date_orig'].isnull() & res['date'].notnull())
# .copy() avoids a SettingWithCopyWarning when a column of changes_df is assigned later
changes_df = res.loc[m, ['order_id', 'date', 'amount', 'name', 'interval', 'is_sent']].copy()

After finding all the matching rows, I set changes_df['is_sent'] to True:

changes_df['is_sent'] = True

After running the above, changes_df is:

order_id   date        amount name   interval is_sent
123        2020-01-03  3      white  today    true
456        2020-12-01  2      blue   weekly   true
135        2020-6-02   3      orange weekly   true
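
The steps so far can be reproduced end to end with the sketch below. The question does not show both input frames, so `old` and `new` are hypothetical stand-ins: `old` mirrors the table at the top of the question, and `new` is made-up data chosen so that the result matches the changes_df shown above. The `.copy()` call is an addition here, assumed to be acceptable, so that assigning is_sent does not operate on a slice of `res`.

```python
import pandas as pd

# Hypothetical inputs: `old` mirrors the table shown in the question,
# `new` is made-up data with later dates for three of the four orders.
old = pd.DataFrame({
    "order_id": [123, 456, 789, 135],
    "date": ["2020-01-02", None, "2020-10-11", "2020-06-01"],
    "amount": [3, 2, 0, 3],
    "name": ["white", "blue", "red", "orange"],
    "interval": ["today", "weekly", "monthly", "weekly"],
    "is_sent": [False, False, False, False],
})
new = old.assign(date=["2020-01-03", "2020-12-01", "2020-10-11", "2020-06-02"])

old["date"] = pd.to_datetime(old["date"])
new["date"] = pd.to_datetime(new["date"])

# Columns from `old` get the "_orig" suffix, columns from `new` keep their names.
res = old.merge(new, on="order_id", suffixes=("_orig", ""))

newer = res["date"].gt(res["date_orig"])                       # date moved forward
filled_in = res["date_orig"].isnull() & res["date"].notnull()  # date was missing before

cols = ["order_id", "date", "amount", "name", "interval", "is_sent"]
changes_df = res.loc[newer | filled_in, cols].copy()
changes_df["is_sent"] = True
print(changes_df)
```

Order 789 is left out because its date is identical in both frames, and comparisons against NaT evaluate to False, which is why the separate `filled_in` condition is needed for order 456.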

Then I just want to update the values in df_2['date'] and df_2['is_sent'] so that they equal changes_df['date'] and changes_df['is_sent'] for the matching order_id.

Any insight is greatly appreciated.

2 Answers:

Answer 0 (score: 0)

df3 = df2.combine_first(cap_df1).reindex(df.index)

This is my solution.
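
The names in this answer (df2, cap_df1, df) do not appear in the question, so how they map onto df_2 and changes_df is not stated. One possible reading, sketched below with hand-built copies of the question's frames, is to index both frames by order_id and let `combine_first` prefer the values from changes_df, falling back to df_2 wherever changes_df has nothing. The variable names `base`, `patch`, and `result` are illustrative only.

```python
import pandas as pd

# Hand-built copies of the frames from the question (dates parsed up front).
df_2 = pd.DataFrame({
    "order_id": [123, 456, 789, 135],
    "date": pd.to_datetime(["2020-01-02", None, "2020-10-11", "2020-06-01"]),
    "amount": [3, 2, 0, 3],
    "name": ["white", "blue", "red", "orange"],
    "interval": ["today", "weekly", "monthly", "weekly"],
    "is_sent": [False, False, False, False],
})
changes_df = pd.DataFrame({
    "order_id": [123, 456, 135],
    "date": pd.to_datetime(["2020-01-03", "2020-12-01", "2020-06-02"]),
    "is_sent": [True, True, True],
})

base = df_2.set_index("order_id")
patch = changes_df.set_index("order_id")[["date", "is_sent"]]

# combine_first keeps values from `patch` and fills the gaps from `base`.
# The result is built on the union of both indexes and columns, so the
# original row and column order is restored explicitly afterwards.
result = (
    patch.combine_first(base)
    .reindex(base.index)[base.columns]
    .reset_index()
)
print(result)
```

The `reindex` step plays the same role as in the answer's one-liner: without it, the rows may come back ordered by the combined index rather than in the order of df_2.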

Answer 1 (score: 0)

Let's try update together with set_index:

cf = changes_df[['order_id','date','is_sent']].set_index('order_id')
df_2 = df_2.set_index('order_id')
df_2.update(cf)
df_2.reset_index(inplace=True)
df_2
   order_id        date  amount    name interval is_sent
0       123  2020-01-03       3   white    today    True
1       456  2020-12-01       2    blue   weekly    True
2       789  2020-10-11       0     red  monthly   False
3       135   2020-6-02       3  orange   weekly    True
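
For anyone who wants to run this approach without the rest of the question's pipeline, here is a self-contained sketch. The two frames are rebuilt by hand from the tables in the question, with dates already parsed, and `updated` is an illustrative name. `DataFrame.update` works in place, returns None, aligns on the index, and skips NaN values in the frame passed to it, which is why both frames are indexed by order_id first.

```python
import pandas as pd

# Hand-built copies of the frames from the question (dates parsed up front).
df_2 = pd.DataFrame({
    "order_id": [123, 456, 789, 135],
    "date": pd.to_datetime(["2020-01-02", None, "2020-10-11", "2020-06-01"]),
    "amount": [3, 2, 0, 3],
    "name": ["white", "blue", "red", "orange"],
    "interval": ["today", "weekly", "monthly", "weekly"],
    "is_sent": [False, False, False, False],
})
changes_df = pd.DataFrame({
    "order_id": [123, 456, 135],
    "date": pd.to_datetime(["2020-01-03", "2020-12-01", "2020-06-02"]),
    "is_sent": [True, True, True],
})

# Only the columns that should overwrite df_2, keyed by order_id.
cf = changes_df.set_index("order_id")[["date", "is_sent"]]

updated = df_2.set_index("order_id")
updated.update(cf)               # modifies `updated` in place, returns None
updated = updated.reset_index()  # bring order_id back as a regular column
print(updated)
```

Rows whose order_id is missing from `cf` (789 here) are left untouched, and columns not present in `cf` (amount, name, interval) keep their original values. Depending on the pandas version, the is_sent column may come back with object dtype rather than bool, so a final `astype(bool)` may be worth adding if the dtype matters.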