Fill missing rows in a DataFrame with zeros

Time: 2017-11-27 14:01:58

Tags: python pandas dataframe

I have a DataFrame like this:

video_id  0   1   2   3    4   5   6   7   8   9  ...  53   54  55  56  
user_id                                           ...                        
0          0   0   0   0    0   0   0   0   0   0 ...   0    0   0   0     
1          2   0   4  13   16   2   0  10   6  45 ...   3  352   6   0    
2          0   0   0   0    0   0   0  11   0   0 ...   0    0   0   0     
3          4  13   0   8    0   0   5   9  12  11 ...  14   17   0   6     
4          0   0   4  13   25   4   0  33   0  39 ...   5    7   4   3     
6          2   0   0   0   12   0   0   0   2   0 ...  19    4   0   0     
7         33  59  52  59  113  53  29  32  59  82 ...  60  119  57  39     
9          0   0   0   0    5   0   0   1   0   4 ...  16    0   0   0     
10         0   0   0   0   40   0   0   0   0   0 ...  26    0   0   0     
11         2   2  32   3   12   3   3  11  19  10 ...  16    3   3   9    
12         0   0   0   0    0   0   0   7   0   0 ...   7    0   0   0     

Some rows are missing from the DataFrame, for example user_id 5 and user_id 8. I want to fill these rows with 0, like this:

video_id  0   1   2   3    4   5   6   7   8   9  ...  53   54  55  56  
user_id                                           ...                        
0          0   0   0   0    0   0   0   0   0   0 ...   0    0   0   0     
1          2   0   4  13   16   2   0  10   6  45 ...   3  352   6   0    
2          0   0   0   0    0   0   0  11   0   0 ...   0    0   0   0     
3          4  13   0   8    0   0   5   9  12  11 ...  14   17   0   6     
4          0   0   4  13   25   4   0  33   0  39 ...   5    7   4   3
5          0   0   0   0    0   0   0   0   0   0 ...   0    0   0   0
6          2   0   0   0   12   0   0   0   2   0 ...  19    4   0   0     
7         33  59  52  59  113  53  29  32  59  82 ...  60  119  57  39 
8          0   0   0   0    0   0   0   0   0   0 ...   0    0   0   0    
9          0   0   0   0    5   0   0   1   0   4 ...  16    0   0   0     
10         0   0   0   0   40   0   0   0   0   0 ...  26    0   0   0     
11         2   2  32   3   12   3   3  11  19  10 ...  16    3   3   9    
12         0   0   0   0    0   0   0   7   0   0 ...   7    0   0   0 

Is there a way to do this?

1 answer:

Answer 0: (score: 2)

You can use arange + reindex:

df = df.reindex(np.arange(df.index.min(), df.index.max() + 1), fill_value=0)

This assumes your index is meant to be a monotonically increasing integer index.
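The assumption about the index can be checked before reindexing. The following is a minimal sketch, not part of the original answer: the helper name `has_fillable_index` and the two small frames are hypothetical, and the check simply tests that the index is integer-typed, unique, and sorted in ascending order.

```python
import pandas as pd


def has_fillable_index(df: pd.DataFrame) -> bool:
    """Return True if the index is integer-typed, unique, and ascending.

    Hypothetical helper for illustration only.
    """
    idx = df.index
    return bool(
        pd.api.types.is_integer_dtype(idx)
        and idx.is_unique
        and idx.is_monotonic_increasing
    )


# Index with a gap (2 is missing) but otherwise well-formed.
ok = pd.DataFrame({"a": [1, 2, 3]}, index=[0, 1, 3])

# Index with a duplicate label; reindex would raise on this frame.
dup = pd.DataFrame({"a": [1, 2, 3]}, index=[0, 1, 1])

print(has_fillable_index(ok))
print(has_fillable_index(dup))
```

Uniqueness matters most here, because `reindex` raises an error when the existing index contains duplicate labels.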

df

     0   1   2   3    4   5   6   7   8   9
0    0   0   0   0    0   0   0   0   0   0
1    2   0   4  13   16   2   0  10   6  45
2    0   0   0   0    0   0   0  11   0   0
3    4  13   0   8    0   0   5   9  12  11
4    0   0   4  13   25   4   0  33   0  39
6    2   0   0   0   12   0   0   0   2   0
7   33  59  52  59  113  53  29  32  59  82
9    0   0   0   0    5   0   0   1   0   4
10   0   0   0   0   40   0   0   0   0   0
11   2   2  32   3   12   3   3  11  19  10
12   0   0   0   0    0   0   0   7   0   0

df.reindex(np.arange(df.index.min(), df.index.max() + 1), fill_value=0)

     0   1   2   3    4   5   6   7   8   9
0    0   0   0   0    0   0   0   0   0   0
1    2   0   4  13   16   2   0  10   6  45
2    0   0   0   0    0   0   0  11   0   0
3    4  13   0   8    0   0   5   9  12  11
4    0   0   4  13   25   4   0  33   0  39
5    0   0   0   0    0   0   0   0   0   0    # <----- 
6    2   0   0   0   12   0   0   0   2   0
7   33  59  52  59  113  53  29  32  59  82
8    0   0   0   0    0   0   0   0   0   0    # <----- 
9    0   0   0   0    5   0   0   1   0   4
10   0   0   0   0   40   0   0   0   0   0
11   2   2  32   3   12   3   3  11  19  10
12   0   0   0   0    0   0   0   7   0   0
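For readers who want to run the approach end to end, here is a self-contained sketch. The small frame below is a made-up miniature of the table in the question (user_id 2 and 4 are missing), not the asker's real data.

```python
import numpy as np
import pandas as pd

# Hypothetical miniature of the question's table: user_id 2 and 4 are missing.
df = pd.DataFrame(
    [[2, 0, 4], [4, 13, 0], [2, 0, 0]],
    index=pd.Index([1, 3, 5], name="user_id"),
    columns=pd.Index([0, 1, 2], name="video_id"),
)

# Every label from the smallest to the largest existing user_id, inclusive.
full_index = np.arange(df.index.min(), df.index.max() + 1)

# reindex keeps the existing rows and fills the newly created rows with 0.
filled = df.reindex(full_index, fill_value=0)

print(filled)
```

Note that the range starts at `df.index.min()`, so IDs below the smallest existing one are not created. If the IDs should always start at 0, passing `np.arange(0, df.index.max() + 1)` instead would be one way to cover them.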