我有一列['A'],上面有很多日期:
df['A'] = ['3/31/2018', '6/22/2018', '7/5/2018',...]
我还有一个由月末组成的日期范围:
rng = pd.date_range('1/31/2019', periods=36, freq='M')
我想根据计算返回36列:
rng - df['A']
我开始执行以下操作,但我知道这样做效率不高:
df['d1'] = pd.to_datetime('1/31/2019')
df['d2'] = df['d1'] + MonthEnd(1)
df['d3'] = df['d2'] + MonthEnd(1)...
(df['d1'] - df['A']).dt.days
(df['d2'] - df['A']).dt.days
(df['d3'] - df['A']).dt.days...
答案 0 :(得分:1)
使用numpy广播来减去值,将timedeltas转换为天并通过构造函数创建DataFrame:
df = pd.DataFrame({'A': ['3/31/2018', '6/22/2018', '7/5/2018']})
df['A'] = pd.to_datetime(df.A)
rng = pd.date_range('1/31/2019', periods=36, freq='M')
df = pd.DataFrame((rng.values - df['A'].values[:, None])
.astype("timedelta64[D]").astype(int), columns=rng)
print (df)
2019-01-31 2019-02-28 2019-03-31 2019-04-30 2019-05-31 2019-06-30 \
0 306 334 365 395 426 456
1 223 251 282 312 343 373
2 210 238 269 299 330 360
2019-07-31 2019-08-31 2019-09-30 2019-10-31 ... 2021-03-31 \
0 487 518 548 579 ... 1096
1 404 435 465 496 ... 1013
2 391 422 452 483 ... 1000
2021-04-30 2021-05-31 2021-06-30 2021-07-31 2021-08-31 2021-09-30 \
0 1126 1157 1187 1218 1249 1279
1 1043 1074 1104 1135 1166 1196
2 1030 1061 1091 1122 1153 1183
2021-10-31 2021-11-30 2021-12-31
0 1310 1340 1371
1 1227 1257 1288
2 1214 1244 1275
[3 rows x 36 columns]