How do I fill missing values in a pandas DataFrame with zeros?

Time: 2020-04-21 12:28:59

Tags: python pandas dataframe missing-data

I have a pandas DataFrame with the following values:

df =

1970-01-01 00:00:18        1        1     0         1             0
1970-01-01 00:00:19        0        0     0         1             0
1970-01-01 00:00:20        0        0     0         1             0
1970-01-01 00:00:25        0        1     0         0             1
1970-01-01 00:00:26        0        0     0         0             1

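Now I want to add a row for every missing second and fill the values of the new rows with zeros, so that the result looks like this: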

df =

1970-01-01 00:00:18        1        1     0         1             0
1970-01-01 00:00:19        0        0     0         1             0
1970-01-01 00:00:20        0        0     0         1             0
1970-01-01 00:00:21        0        0     0         0             0
1970-01-01 00:00:22        0        0     0         0             0
1970-01-01 00:00:23        0        0     0         0             0
1970-01-01 00:00:24        0        0     0         0             0
1970-01-01 00:00:25        0        1     0         0             1
1970-01-01 00:00:26        0        0     0         0             1

I have looked into reindexing and resampling, but could not find a way to make either work.

Ideally, I would also like to remove the "1970-01-01" part from the timestamps, but that is not a priority.

1 answer:

Answer 0: (score: 2)

Use DataFrame.asfreq, which requires a DatetimeIndex, and, if necessary, convert the index back to a column at the end:

import pandas as pd

# df is the DataFrame from the question, with the timestamps in a 'date' column
print(df)
                  date  a  b  c  d  e
0  1970-01-01 00:00:18  1  1  0  1  0
1  1970-01-01 00:00:19  0  0  0  1  0
2  1970-01-01 00:00:20  0  0  0  1  0
3  1970-01-01 00:00:25  0  1  0  0  1
4  1970-01-01 00:00:26  0  0  0  0  1

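# Parse the strings into datetimes so the column can serve as a DatetimeIndex
df['date'] = pd.to_datetime(df['date'])

# Conform the index to a one-second frequency; rows created for the
# missing seconds are filled with 0, then the index becomes a column again
# (recent pandas versions prefer the lowercase alias 's' over 'S')
df = df.set_index('date').asfreq('S', fill_value=0).reset_index()
print(df)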
                 date  a  b  c  d  e
0 1970-01-01 00:00:18  1  1  0  1  0
1 1970-01-01 00:00:19  0  0  0  1  0
2 1970-01-01 00:00:20  0  0  0  1  0
3 1970-01-01 00:00:21  0  0  0  0  0
4 1970-01-01 00:00:22  0  0  0  0  0
5 1970-01-01 00:00:23  0  0  0  0  0
6 1970-01-01 00:00:24  0  0  0  0  0
7 1970-01-01 00:00:25  0  1  0  0  1
8 1970-01-01 00:00:26  0  0  0  0  1
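
The question also mentions reindexing and asks about dropping the "1970-01-01" part, which the answer above does not cover. The following is a minimal, self-contained sketch of one way this might be done, not part of the original answer: it rebuilds the sample data (the column names a to e are taken from the answer), reindexes onto an explicit one-second range built with pd.date_range, and then formats the index as time-of-day strings. The names indexed, full_range, filled and result, and the column name "time", are illustrative assumptions.

```python
import pandas as pd

# Rebuild the sample data from the question (column names a-e follow the answer).
df = pd.DataFrame(
    {
        "date": [
            "1970-01-01 00:00:18",
            "1970-01-01 00:00:19",
            "1970-01-01 00:00:20",
            "1970-01-01 00:00:25",
            "1970-01-01 00:00:26",
        ],
        "a": [1, 0, 0, 0, 0],
        "b": [1, 0, 0, 1, 0],
        "c": [0, 0, 0, 0, 0],
        "d": [1, 1, 1, 0, 0],
        "e": [0, 0, 0, 1, 1],
    }
)
df["date"] = pd.to_datetime(df["date"])
indexed = df.set_index("date")

# Build every second between the first and last timestamp explicitly.
# A Timedelta is used as the frequency to avoid version-specific string aliases.
full_range = pd.date_range(
    indexed.index.min(), indexed.index.max(), freq=pd.Timedelta(seconds=1)
)

# Reindex onto the full range; timestamps that were absent get 0 in every column.
filled = indexed.reindex(full_range, fill_value=0)

# Optional: keep only the time of day. Note that the index then holds
# plain strings, so it is no longer a DatetimeIndex.
filled.index = filled.index.strftime("%H:%M:%S")
filled.index.name = "time"  # hypothetical column name for illustration

result = filled.reset_index()
print(result)
```

Formatting the index as strings should be the last step, because asfreq, resample and similar time-based operations need a real DatetimeIndex. If the values have to stay time-aware, converting them to timedeltas measured from midnight would be another option.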