Inserting blank rows into a pandas DataFrame

Time: 2018-11-29 12:15:06

Tags: python python-3.x pandas dataframe

I have a column called "factor", and I want to insert a blank row whenever the value in that column changes. Is this possible?

for i in range(0, end):
    if df2.at[i + 1, 'factor'] != df2.at[i, 'factor']:
        pass  # a blank row should be inserted after row i here

2 answers:

Answer 0 (score: 2)

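Inserting rows one at a time in a for loop is inefficient. Instead, you can find the index labels where the value changes, construct a new DataFrame of empty rows at those positions, concatenate it with the original, and then sort by index: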

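import pandas as pd

df = pd.DataFrame([[1, 1], [2, 1], [3, 2], [4, 2],
                   [5, 2], [6, 3]], columns=['A', 'B'])

# True on each row whose next value in 'B' differs (the last row is always True)
switches = df['B'].ne(df['B'].shift(-1))
idx = switches[switches].index

# Empty rows labelled halfway between each switch point and the following row
df_new = pd.DataFrame(index=idx + 0.5)
df = pd.concat([df, df_new]).sort_index()

print(df)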

       A    B
0.0  1.0  1.0
1.0  2.0  1.0
1.5  NaN  NaN
2.0  3.0  2.0
3.0  4.0  2.0
4.0  5.0  2.0
4.5  NaN  NaN
5.0  6.0  3.0
5.5  NaN  NaN

If necessary, you can use reset_index to normalize the index:

print(df.reset_index(drop=True))

     A    B
0  1.0  1.0
1  2.0  1.0
2  NaN  NaN
3  3.0  2.0
4  4.0  2.0
5  5.0  2.0
6  NaN  NaN
7  6.0  3.0
8  NaN  NaN
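
The result above ends with a blank row because the last row always compares unequal to the NaN that shift(-1) introduces. If that trailing row is unwanted, one possible variation (a sketch, not part of the original answer; the names blank and result are chosen here for illustration) is to clear the last switch flag before building the blank rows:

```python
import pandas as pd

df = pd.DataFrame([[1, 1], [2, 1], [3, 2], [4, 2],
                   [5, 2], [6, 3]], columns=['A', 'B'])

# True on each row whose next value in 'B' differs
switches = df['B'].ne(df['B'].shift(-1))
# The final row is always flagged (it is compared against NaN),
# so clear that flag to avoid a trailing blank row
switches.iloc[-1] = False
idx = switches[switches].index

# Assumption: the index is the default 0..n-1, so "+ 0.5" falls between rows
blank = pd.DataFrame(index=idx + 0.5)
result = pd.concat([df, blank]).sort_index().reset_index(drop=True)

print(result)
```

This keeps the approach of the answer unchanged and only drops the final half-step label.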

Answer 1 (score: 1)

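Use reindex with a float index built as the union of the original index and the edge indices (the last row of each run of equal values) plus 0.5. Older pandas versions display this index as Float64Index; from pandas 2.0 on it is a plain Index with dtype float64.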

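import pandas as pd

df2 = pd.DataFrame({'factor': list('aaabbccdd')})

# Add a half-step label after the last row of each run; [:-1] drops the one after the final row
idx = df2.index.union(df2.index[df2['factor'].shift(-1).ne(df2['factor'])] + .5)[:-1]
print(idx)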
Float64Index([0.0, 1.0, 2.0, 2.5, 3.0, 4.0, 4.5, 5.0, 6.0, 6.5, 7.0, 8.0], dtype='float64')

# fill_value='' puts empty strings into the newly created rows
df2 = df2.reindex(idx, fill_value='').reset_index(drop=True)
print(df2)
   factor
0       a
1       a
2       a
3        
4       b
5       b
6        
7       c
8       c
9        
10      d
11      d

If you want missing values (NaN) in the blank rows instead, omit fill_value:

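# Rebuild the original frame first, because df2 was overwritten above
df2 = pd.DataFrame({'factor': list('aaabbccdd')})
df2 = df2.reindex(idx).reset_index(drop=True)
print(df2)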
   factor
0       a
1       a
2       a
3     NaN
4       b
5       b
6     NaN
7       c
8       c
9     NaN
10      d
11      d
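
For reuse, the reindex approach could be wrapped in a small function. The following is only a sketch under stated assumptions: insert_blank_rows and its parameters are hypothetical names, not a pandas API, and the function resets the index to 0..n-1 first so that adding 0.5 always lands between two neighbouring rows.

```python
import pandas as pd


def insert_blank_rows(df, column, fill_value=None):
    """Hypothetical helper: add one blank row after each run of equal values."""
    # Work on a clean 0..n-1 index so "+ 0.5" falls between neighbouring rows
    df = df.reset_index(drop=True)
    # True on the last row of every run (the final row is always flagged)
    is_edge = df[column].shift(-1).ne(df[column])
    # Half-step labels after each edge, excluding the one after the final row
    gaps = df.index[is_edge][:-1] + 0.5
    new_index = df.index.union(gaps)
    if fill_value is None:
        out = df.reindex(new_index)  # blank rows hold NaN
    else:
        out = df.reindex(new_index, fill_value=fill_value)
    return out.reset_index(drop=True)


df2 = pd.DataFrame({'factor': list('aaabbccdd')})
with_strings = insert_blank_rows(df2, 'factor', fill_value='')
with_nan = insert_blank_rows(df2, 'factor')
print(with_strings)
```

Because the function works on a copy with a reset index, the caller's df2 is left untouched and both variants can be produced from the same input.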