How to fill in missing values in a pandas DataFrame

Time: 2020-01-15 07:51:04

Tags: python pandas dataframe

I have a DataFrame in which some months are missing.

    index   month   value
    0       201501  100
    1       201507  172
    2       201602  181
    3       201605   98

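I want to fill in the missing months of the DataFrame above using the list below, carrying the last known value forward.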

    list = [201501, 201502, 201503 ... 201612]

The result I want:

    index   month   value
    0       201501  100
    1       201502  100
    2       201503  100
    3       201504  100
    4       201505  100
    5       201506  100
    6       201507  172
    7       201508  172
    ...
    ...
    22      201611   98
    23      201612   98

2 answers:

Answer 0: (score: 2)

Use pandas.DataFrame.merge

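import pandas as pd

# Months to fill in. range() only yields valid YYYYMM values within a single year.
months = list(range(201501, 201509))

# Outer-merge so every listed month is present, sort by month, then forward-fill the gaps.
new_df = df.merge(pd.Series(months, name='month'), how='outer').sort_values('month').ffill()
new_df['index'] = range(new_df.shape[0])  # renumber the 'index' column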

Output:

   index   month  value
0      0  201501  100.0
4      1  201502  100.0
5      2  201503  100.0
6      3  201504  100.0
7      4  201505  100.0
8      5  201506  100.0
1      6  201507  172.0
9      7  201508  172.0
2      8  201602  181.0
3      9  201605   98.0
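
The `range(201501, 201509)` call above stays inside a single year. For the full list in the question (201501 through 201612), a plain integer range would also produce values such as 201513 that are not months. Below is a minimal sketch of one way around that, assuming the data from the question. The names `all_months`, `full`, and `filled` are illustrative and not from the answer, and it swaps the outer merge for a left merge from the complete month list, which is only equivalent when the list already covers every month in `df`.

```python
import pandas as pd

# Data from the question (without the clipboard 'index' column).
df = pd.DataFrame({
    "month": [201501, 201507, 201602, 201605],
    "value": [100, 172, 181, 98],
})

# Every calendar month from 2015-01 through 2016-12 as YYYYMM integers.
all_months = (
    pd.period_range("2015-01", "2016-12", freq="M")
      .strftime("%Y%m")
      .astype(int)
)

# Left-merge from the complete month list, then forward-fill the gaps.
full = pd.DataFrame({"month": all_months})
filled = (
    full.merge(df, on="month", how="left")
        .sort_values("month")
        .ffill()
        .reset_index(drop=True)
)

# The merge turns 'value' into float because of the temporary NaNs.
# Casting back assumes the first month has a value, so nothing stays NaN.
filled["value"] = filled["value"].astype(int)

print(filled)
```

If the first month in the list had no value, `ffill` would leave it as NaN and the final cast would fail, so that case would need separate handling.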

Answer 1: (score: 2)

Setup

import numpy as np
import pandas as pd

my_list = list(range(201501, 201509))
df = df.drop('index', axis=1)  # drop the 'index' column that pd.read_clipboard picked up
print(df)
    month  value
0  201501    100
1  201507    172
2  201602    181
3  201605     98

pd.DataFrame.reindex

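# Reindex on the sorted union of the existing and listed months, then forward-fill.
df = (df.set_index('month')
        .reindex(index=np.unique(df['month'].tolist() + my_list))  # np.unique already returns sorted values
        .ffill()
        .reset_index())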
print(df)
     month  value
0   201501  100.0
1   201502  100.0
2   201503  100.0
3   201504  100.0
4   201505  100.0
5   201506  100.0
6   201507  172.0
7   201508  172.0
8   201602  181.0
9   201605   98.0
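
When the list already contains every month present in the DataFrame, as the 201501 to 201612 list in the question does, the union step is not needed. Here is a minimal sketch under that assumption, using the question's data. It uses the `method="ffill"` argument of `reindex` instead of a separate `.ffill()` call, which is a variation on the answer above and requires the existing index to be sorted. The names `month_list` and `result` are illustrative.

```python
import pandas as pd

# Data from the question (without the clipboard 'index' column).
df = pd.DataFrame({
    "month": [201501, 201507, 201602, 201605],
    "value": [100, 172, 181, 98],
})

# All YYYYMM values for 2015 and 2016, built so that only valid months appear.
month_list = [year * 100 + m for year in (2015, 2016) for m in range(1, 13)]

# Reindex directly on the full list; method="ffill" takes the value from the
# closest earlier month. This needs the 'month' index to be monotonically increasing.
result = (
    df.set_index("month")
      .reindex(month_list, method="ffill")
      .reset_index()
)

print(result)
```

Unlike a trailing `.ffill()`, `method="ffill"` only fills rows introduced by the reindex, so NaN values already present in the original data would stay as they are.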