数据框示例
Date Code Count_1 Count_2
01-01-2021 A 5 4
04-01-2021 A 5 5
05-01-2021 A 7 5
07-01-2021 A 6 7
05-02-2021 B 8 7
09-02-2021 B 4 3
10-02-2021 B 4 5
11-02-2021 B 6 7
.
.
.
如何添加 starting date
(01-01-2021) 和 current date
(按 Code
分组)之间的所有缺失值(如果有),而 {{1} 中的值} 和 Count_1
为相关缺失日期等于 NaN。
答案 0 :(得分:1)
将 DataFrame.reindex
与唯一的 code
和 date_range
用于日期和代码对的所有组合:
df['Date'] = pd.to_datetime(df['Date'])
idx = pd.date_range('01-01-2021', pd.to_datetime('now').normalize())
codes = df['Code'].unique()
df1 = (df.set_index(['Date','Code'])
.reindex(pd.MultiIndex.from_product([idx, codes], names=['Date','Code']))
.reset_index())
print (df1)
Date Code Count_1 Count_2
0 2021-01-01 A 5.0 4.0
1 2021-01-01 B NaN NaN
2 2021-01-02 A NaN NaN
3 2021-01-02 B NaN NaN
4 2021-01-03 A NaN NaN
.. ... ... ... ...
345 2021-06-22 B NaN NaN
346 2021-06-23 A NaN NaN
347 2021-06-23 B NaN NaN
348 2021-06-24 A NaN NaN
349 2021-06-24 B NaN NaN
[350 rows x 4 columns]