Append or add rows in a Pandas DataFrame

Time: 2016-09-07 09:47:38

Tags: python csv pandas append pivot-table

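In the DataFrame below, I want to add rows if the count of a value in column A is less than 10.

For example, in the table below, group 60 in column A appears 12 times, but group 61 appears only 9 times. I want to add a row after the last record of group 61 and copy the values in columns B, C and D from the corresponding rows of group 60. The same applies to group 62, and so on.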

     A       B   C      D
0   60   0.235   4   7.86
1   60   1.235   5   8.86
2   60   2.235   6   9.86
3   60   3.235   7  10.86
4   60   4.235   8  11.86
5   60   5.235   9  12.86
6   60   6.235  10  13.86
7   60   7.235  11  14.86
8   60   8.235  12  15.86
9   60   9.235  13  16.86
10  60  10.235  14  17.86
11  60  11.235  15  18.86
12  61  12.235  16  19.86
13  61  13.235  17  20.86
14  61  14.235  18  21.86
15  61  15.235  19  22.86
16  61  16.235  20  23.86
17  61  17.235  21  24.86
18  61  18.235  22  25.86
19  61  19.235  23  26.86
20  61  20.235  24  27.86
21  62  20.235  24  28.86
22  62  20.235  24  29.86
23  62  20.235  24  30.86
24  62  20.235  24  31.86
25  62  20.235  24  32.86
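
To make the question reproducible, the table above can be rebuilt roughly as follows. This is only a sketch: the list expressions are one way of generating the printed values, not how the original data was produced.

```python
import pandas as pd

# Rebuild the sample table: 12 rows for group 60, 9 for 61, 5 for 62.
a = [60] * 12 + [61] * 9 + [62] * 5
b = [i + 0.235 for i in range(21)] + [20.235] * 5   # B stops increasing at 20.235
c = list(range(4, 25)) + [24] * 5                   # C stops increasing at 24
d = [round(7.86 + i, 2) for i in range(26)]         # D increases on every row
df = pd.DataFrame({'A': a, 'B': b, 'C': c, 'D': d})

# Number of rows per group in column A.
counts = df['A'].value_counts().sort_index()
print(counts)
```

The `counts` Series shows which groups are shorter than the longest one and therefore need extra rows.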

1 Answer:

Answer 0: (score: 2)

You can use:

# cumulative count per group
df['G'] = df.groupby('A').cumcount()

# the parentheses allow the method chain to span several lines
df = (df.groupby(['A','G'])
        .first()    # aggregate first
        .unstack()  # reshape DataFrame
        .ffill()    # same as fillna(method='ffill')
        .stack()    # get original shape
        .reset_index(drop=True, level=1)  # remove level G in index
        .reset_index())

print (df)
     A       B     C      D
0   60   0.235   4.0   7.86
1   60   1.235   5.0   8.86
2   60   2.235   6.0   9.86
3   60   3.235   7.0  10.86
4   60   4.235   8.0  11.86
5   60   5.235   9.0  12.86
6   60   6.235  10.0  13.86
7   60   7.235  11.0  14.86
8   60   8.235  12.0  15.86
9   60   9.235  13.0  16.86
10  60  10.235  14.0  17.86
11  60  11.235  15.0  18.86
12  61  12.235  16.0  19.86
13  61  13.235  17.0  20.86
14  61  14.235  18.0  21.86
15  61  15.235  19.0  22.86
16  61  16.235  20.0  23.86
17  61  17.235  21.0  24.86
18  61  18.235  22.0  25.86
19  61  19.235  23.0  26.86
20  61  20.235  24.0  27.86
21  61   9.235  13.0  16.86
22  61  10.235  14.0  17.86
23  61  11.235  15.0  18.86
24  62  20.235  24.0  28.86
25  62  20.235  24.0  29.86
26  62  20.235  24.0  30.86
27  62  20.235  24.0  31.86
28  62  20.235  24.0  32.86
29  62  17.235  21.0  24.86
30  62  18.235  22.0  25.86
31  62  19.235  23.0  26.86
32  62  20.235  24.0  27.86
33  62   9.235  13.0  16.86
34  62  10.235  14.0  17.86
35  62  11.235  15.0  18.86
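
The intermediate steps are easier to follow on a smaller frame. The sketch below uses made-up data (three groups of 3, 2 and 1 rows; the names `wide`, `filled` and `out` are only for illustration) and splits the chain so the wide shape can be inspected.

```python
import pandas as pd

# Toy data: group 1 has 3 rows, group 2 has 2 rows, group 3 has 1 row.
df = pd.DataFrame({'A': [1, 1, 1, 2, 2, 3],
                   'B': [10, 11, 12, 20, 21, 30]})

# Position of each row inside its group: 0, 1, 2, ...
df['G'] = df.groupby('A').cumcount()

# One row per group, one column per position; shorter groups get NaN.
wide = df.groupby(['A', 'G']).first().unstack()
print(wide)

# Fill each missing position from the nearest group above that has a value.
filled = wide.ffill()

# Back to long format, dropping the helper level G.
out = (filled.stack()
             .reset_index(level=1, drop=True)
             .reset_index())
print(out)
```

Two things follow from how `ffill` works here. The missing rows are taken from the nearest preceding group that has a value at that position, which is why group 62 in the output above gets some rows from group 61 and some from group 60. And because NaN appears in the intermediate frame, integer columns such as C come back as floats.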

Another solution with pivot_table:

df['G'] = df.groupby('A').cumcount()

# the parentheses allow the method chain to span several lines
df = (df.pivot_table(index='A', columns='G')
        .ffill()
        .stack()
        .reset_index(drop=True, level=1)
        .reset_index())

print (df)
     A       B     C      D
0   60   0.235   4.0   7.86
1   60   1.235   5.0   8.86
2   60   2.235   6.0   9.86
3   60   3.235   7.0  10.86
4   60   4.235   8.0  11.86
5   60   5.235   9.0  12.86
6   60   6.235  10.0  13.86
7   60   7.235  11.0  14.86
8   60   8.235  12.0  15.86
9   60   9.235  13.0  16.86
10  60  10.235  14.0  17.86
11  60  11.235  15.0  18.86
12  61  12.235  16.0  19.86
13  61  13.235  17.0  20.86
14  61  14.235  18.0  21.86
15  61  15.235  19.0  22.86
16  61  16.235  20.0  23.86
17  61  17.235  21.0  24.86
18  61  18.235  22.0  25.86
19  61  19.235  23.0  26.86
20  61  20.235  24.0  27.86
21  61   9.235  13.0  16.86
22  61  10.235  14.0  17.86
23  61  11.235  15.0  18.86
24  62  20.235  24.0  28.86
25  62  20.235  24.0  29.86
26  62  20.235  24.0  30.86
27  62  20.235  24.0  31.86
28  62  20.235  24.0  32.86
29  62  17.235  21.0  24.86
30  62  18.235  22.0  25.86
31  62  19.235  23.0  26.86
32  62  20.235  24.0  27.86
33  62   9.235  13.0  16.86
34  62  10.235  14.0  17.86
35  62  11.235  15.0  18.86