Pandas - return x columns based on date differences

Time: 2018-12-11 09:37:14

Tags: python pandas

I have a column ['A'] that holds many dates:

df['A'] = ['3/31/2018', '6/22/2018', '7/5/2018',...]

I also have a date range made up of month-ends:

rng = pd.date_range('1/31/2019', periods=36, freq='M')

I want to return 36 columns, one per date in the range, based on this calculation:

rng - df['A']

I started with the following, but I know it is not efficient:

from pandas.tseries.offsets import MonthEnd

df['d1'] = pd.to_datetime('1/31/2019')
df['d2'] = df['d1'] + MonthEnd(1)
df['d3'] = df['d2'] + MonthEnd(1)...

(df['d1'] - df['A']).dt.days
(df['d2'] - df['A']).dt.days
(df['d3'] - df['A']).dt.days... 

1 Answer:

Answer 0 (score: 1)

Use numpy broadcasting to subtract the values, convert the timedeltas to days, and create the DataFrame through the constructor:

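import pandas as pd

df = pd.DataFrame({'A': ['3/31/2018', '6/22/2018', '7/5/2018']})
df['A'] = pd.to_datetime(df['A'])

# Month-end dates; pandas 2.2+ spells this frequency 'ME' instead of 'M'
rng = pd.date_range('1/31/2019', periods=36, freq='M')

# (36,) minus (3, 1) broadcasts to a (3, 36) array of timedeltas
df = pd.DataFrame((rng.values - df['A'].values[:, None])
                  .astype("timedelta64[D]").astype(int), columns=rng)
print(df)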
   2019-01-31  2019-02-28  2019-03-31  2019-04-30  2019-05-31  2019-06-30  \
0         306         334         365         395         426         456   
1         223         251         282         312         343         373   
2         210         238         269         299         330         360   

   2019-07-31  2019-08-31  2019-09-30  2019-10-31     ...      2021-03-31  \
0         487         518         548         579     ...            1096   
1         404         435         465         496     ...            1013   
2         391         422         452         483     ...            1000   

   2021-04-30  2021-05-31  2021-06-30  2021-07-31  2021-08-31  2021-09-30  \
0        1126        1157        1187        1218        1249        1279   
1        1043        1074        1104        1135        1166        1196   
2        1030        1061        1091        1122        1153        1183   

   2021-10-31  2021-11-30  2021-12-31  
0        1310        1340        1371  
1        1227        1257        1288  
2        1214        1244        1275  

[3 rows x 36 columns]
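
For readers who want to see the broadcasting step in isolation, here is a minimal sketch of the same idea split into named intermediate steps. It is a variation on the answer, not the answerer's exact code. Two choices are my own assumptions: the month-ends are built from a month-start range plus `MonthEnd(0)` so the snippet does not depend on the `'M'` versus `'ME'` frequency alias, and the original dates are kept as the row index so each row stays traceable. The names `a`, `diff`, `days` and `result` are illustrative only.

```python
import pandas as pd

df = pd.DataFrame({'A': pd.to_datetime(['3/31/2018', '6/22/2018', '7/5/2018'])})

# 36 month-ends starting at 2019-01-31: take month starts, then roll each
# forward to the end of its own month (assumption: avoids the 'M'/'ME' alias).
rng = pd.date_range('2019-01-01', periods=36, freq='MS') + pd.offsets.MonthEnd(0)

a = df['A'].to_numpy()                 # shape (3,)
diff = rng.to_numpy() - a[:, None]     # (36,) - (3, 1) broadcasts to (3, 36)

# Truncate the timedeltas to whole days, then drop the unit to get integers.
days = diff.astype('timedelta64[D]').astype('int64')

# Rows are labelled by the original date, columns by the month-end.
result = pd.DataFrame(days, index=df['A'], columns=rng)
print(result.iloc[:, :3])
```

The key step is `a[:, None]`, which turns the 3 dates into a column vector so that numpy pairs every date in `A` with every date in `rng` in a single subtraction, with no Python-level loop and no 36 helper columns.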