Updating calculated fields when the DataFrame is refreshed - Pandas

Time: 2019-12-09 15:00:46

Tags: python python-3.x pandas dataframe

I have a DataFrame that looks like this:

import pandas as pd

d = {'ID': [0, 1, 2, 3, 4], 
     'm1': ['2019-12-06', '2019-12-07','2019-12-07', '2019-12-06', '2020-12-09'], 
     'm2': ['2019-12-07', None, None, '2019-12-07', None], 
     'm3': [None, None, None, '2019-12-09', None],
     'm1_m2': [1, 1, 2, 2, 3],
     'm2_m3': [3, 3, 4, 1, 2]}

dat = pd.DataFrame(d)

print(dat)

   ID          m1          m2          m3  m1_m2  m2_m3
0   0  2019-12-06  2019-12-07        None      1      3
1   1  2019-12-07        None        None      1      3
2   2  2019-12-07        None        None      2      4
3   3  2019-12-06  2019-12-07  2019-12-09      2      1
4   4  2020-12-09        None        None      3      2

I want to create two new fields that estimate m2 and m3, respectively.

Whenever m2 or m3 is missing, the corresponding field, m2_estimated or m3_estimated, should be calculated.

The expected output is:

ID          m1          m2          m3     m1_m2    m2_m3   m2_estimated    m3_estimated
0   2019-12-06  2019-12-07        None         1        3           None      2019-12-10
1   2019-12-07        None        None         1        3     2019-12-08      2019-12-11
2   2019-12-07        None        None         2        4     2019-12-09      2019-12-13
3   2019-12-06  2019-12-07  2019-12-09         2        1           None            None
4   2020-12-09        None        None         3        2     2019-12-12      2019-12-14

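The logic is simple: m2_estimated is m1 plus m1_m2 days, and m3_estimated is m2 plus m2_m3 days (using m2_estimated when m2 itself is missing).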

2 Answers:

Answer 0: (score: 1)

df['m2_estimated'] = pd.to_datetime(df['m1']) + pd.to_timedelta(df['m1_m2'], unit='D')

If you don't want a full datetime, you can use the dt accessor to convert it to a plain date:

df['m2_estimated'] = df['m2_estimated'].dt.date
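
The line above fills m2_estimated for every row and leaves m3_estimated out. As a minimal sketch of how both fields might be restricted to rows where the real value is missing, assuming the question's data and the rule "use the real m2 when present, otherwise the estimate", something like the following could work. The helper names (m2_candidate, m2_best, m3_candidate) are illustrative only. Note that row 4 has m1 = 2020-12-09 in the question's data, so its estimates fall in 2020 rather than the 2019 dates shown in the expected output.

```python
import pandas as pd

# Same data as in the question.
d = {'ID': [0, 1, 2, 3, 4],
     'm1': ['2019-12-06', '2019-12-07', '2019-12-07', '2019-12-06', '2020-12-09'],
     'm2': ['2019-12-07', None, None, '2019-12-07', None],
     'm3': [None, None, None, '2019-12-09', None],
     'm1_m2': [1, 1, 2, 2, 3],
     'm2_m3': [3, 3, 4, 1, 2]}
df = pd.DataFrame(d)

# Parse the date strings; None becomes NaT.
m1 = pd.to_datetime(df['m1'])
m2 = pd.to_datetime(df['m2'])
m3 = pd.to_datetime(df['m3'])

# Candidate estimate for every row: m1 plus m1_m2 days.
m2_candidate = m1 + pd.to_timedelta(df['m1_m2'], unit='D')

# Keep the estimate only where the real m2 is missing.
df['m2_estimated'] = m2_candidate.where(m2.isna())

# Best available m2: the real value if present, otherwise the estimate.
m2_best = m2.fillna(m2_candidate)

# m3 estimate: best m2 plus m2_m3 days, kept only where the real m3 is missing.
m3_candidate = m2_best + pd.to_timedelta(df['m2_m3'], unit='D')
df['m3_estimated'] = m3_candidate.where(m3.isna())

print(df[['ID', 'm2_estimated', 'm3_estimated']])
```

Because the estimates are derived from the current columns each time, rerunning these lines after the DataFrame is refreshed recomputes them, and rows whose real m2 or m3 has since arrived become NaT.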

Answer 1: (score: 1)

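df['m2_estimated'] = pd.to_datetime(df['m1']) + pd.to_timedelta(df['m1_m2'].astype(int), unit='D')

The code above is sufficient. Make sure m1_m2 is in integer format; recent pandas versions raise a TypeError when a plain integer Series is added to datetimes, so the day counts are converted with pd.to_timedelta first.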
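
On the point about m1_m2 being numeric: if the column arrives as strings or contains unusable entries, one option is to coerce it before doing the date arithmetic. This is only a sketch; the three-row frame and the 'n/a' entry below are made up for illustration and are not part of the question's data.

```python
import pandas as pd

# Hypothetical input: day counts stored as strings, one of them unusable.
df = pd.DataFrame({'m1': ['2019-12-06', '2019-12-07', '2019-12-07'],
                   'm1_m2': ['1', '2', 'n/a']})

# Coerce to numbers; anything that cannot be parsed becomes NaN.
days = pd.to_numeric(df['m1_m2'], errors='coerce')

# NaN day counts become NaT, so the estimate is simply missing for those rows.
df['m2_estimated'] = pd.to_datetime(df['m1']) + pd.to_timedelta(days, unit='D')

print(df)
```

Whether silently producing NaT is acceptable, or whether bad entries should raise instead (the default behavior of pd.to_numeric), depends on how the data is refreshed.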