I have a DataFrame:
df = pd.DataFrame({'B':[2,1,2],'C':['a','b','a']})
B C
0 2 'a'
1 1 'b'
2 2 'a'
I want to insert a row directly below each occurrence of 'b': a duplicate of that row with 'b' changed to 'c'. The result should look like this:
B C
0 2 'a'
1 1 'b'
1 1 'c'
2 2 'a'
For the life of me, I can't figure out how to do this.
Answer 0 (score: 4)
Here's one way of doing it:
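duplicates = df[df['C'] == 'b'].copy()
duplicates['C'] = 'c'
# DataFrame.append was removed in pandas 2.0, so use pd.concat instead.
# A stable sort keeps each 'b' row ahead of its 'c' copy.
pd.concat([df, duplicates]).sort_index(kind='mergesort')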
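For reference, the steps above can be wrapped into a small helper. This is only a sketch: the function name `duplicate_rows_below` and its parameters are made up for illustration, it uses `pd.concat` because `DataFrame.append` was removed in pandas 2.0, and it assumes the index is unique and already sorted (as with the default integer index).

```python
import pandas as pd

def duplicate_rows_below(df, column, old, new):
    """Return a frame with a modified copy inserted below each matching row.

    Hypothetical helper; assumes df has a unique, sorted index.
    """
    duplicates = df[df[column] == old].copy()
    duplicates[column] = new
    # mergesort is stable, so each original row stays ahead of its copy
    return pd.concat([df, duplicates]).sort_index(kind="mergesort")

df = pd.DataFrame({'B': [2, 1, 2], 'C': ['a', 'b', 'a']})
result = duplicate_rows_below(df, 'C', 'b', 'c')
print(result)
```

If a fresh 0..n-1 index is preferred over the repeated labels, `result.reset_index(drop=True)` can be applied afterwards.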
Answer 1 (score: 1)
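Working at the NumPy level, here's a vectorized approach -
import numpy as np
import pandas as pd
arr = df.values  # mixed dtypes give an object array
idx = np.flatnonzero(df.C=='b')  # positions of the rows to duplicate
newvals = arr[idx]  # fancy indexing returns a copy, so arr is not modified
newvals[:,df.columns.get_loc("C")] = 'c'
out = np.insert(arr,idx+1,newvals,axis=0)  # place each copy right after its source row
df_index = np.insert(np.arange(arr.shape[0]),idx+1,idx,axis=0)  # repeat the source row's label
# Add columns=df.columns to keep the column names; without it they become 0, 1
df_out = pd.DataFrame(out,index=df_index)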
Sample run -
In [149]: df
Out[149]:
B C
0 2 a
1 1 b
2 2 d
3 4 d
4 3 b
5 8 a
6 4 a
7 2 b
In [150]: df_out
Out[150]:
0 1
0 2 a
1 1 b
1 1 c
2 2 d
3 4 d
4 3 b
4 3 c
5 8 a
6 4 a
7 2 b
7 2 c
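As a self-contained variant of the NumPy approach, here is a sketch that also keeps the column names. The function name `insert_modified_copies` is hypothetical, and the data is the sample frame from the run above.

```python
import numpy as np
import pandas as pd

def insert_modified_copies(df, column, old, new):
    """Insert a modified copy directly after each row where df[column] == old."""
    arr = df.to_numpy(dtype=object)
    idx = np.flatnonzero((df[column] == old).to_numpy())
    newvals = arr[idx]  # fancy indexing returns a copy
    newvals[:, df.columns.get_loc(column)] = new
    # Insert each copy right after its source row
    out = np.insert(arr, idx + 1, newvals, axis=0)
    # Repeat the source row's index label for each inserted copy
    labels = df.index.to_numpy()
    index = np.insert(labels, idx + 1, labels[idx])
    return pd.DataFrame(out, index=index, columns=df.columns)

df = pd.DataFrame({'B': [2, 1, 2, 4, 3, 8, 4, 2],
                   'C': ['a', 'b', 'd', 'd', 'b', 'a', 'a', 'b']})
df_out = insert_modified_copies(df, 'C', 'b', 'c')
print(df_out)
```

Because the values pass through an object array, every column of the result has object dtype; calling `df_out.infer_objects()` should restore numeric dtypes where possible.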