Add a third column in pandas holding the difference between consecutive rows, computed only where col1 and col2 are the same.
col1 col2
A B
A B
C D
C D
C D
First expected output:
col1 col2 col3_count
A B 2
A B 2
C D 3
C D 3
C D 3
Second expected output:
col1 col2 col3_count diff
A B 2 NaN
A B 2 0
C D 3 NaN
C D 3 0
C D 3 0
Answer 0 (score: 1)
df_out = df.assign(col3_count=df.groupby(['col1','col2'])['col1'].transform('size'))
Output:
col1 col2 col3_count
0 A B 2
1 A B 2
2 C D 3
3 C D 3
4 C D 3
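To run the count step on its own, the sample frame from the question can be built by hand. This is a minimal sketch, assuming the default integer index and the column names col1 and col2 shown above; the variable names are only illustrative.

```python
import pandas as pd

# Sample data copied from the question (default RangeIndex assumed).
df = pd.DataFrame({
    "col1": ["A", "A", "C", "C", "C"],
    "col2": ["B", "B", "D", "D", "D"],
})

# transform("size") returns one value per original row (the size of that
# row's group), so the result aligns with df and can be assigned directly.
df_out = df.assign(
    col3_count=df.groupby(["col1", "col2"])["col1"].transform("size")
)
print(df_out)
```

Using transform rather than a plain size() aggregation keeps the result the same length as df, which is why no merge back onto the original frame is needed.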
df_out.assign(diff=df_out.groupby(['col1','col2'])['col3_count'].diff())
Output:
col1 col2 col3_count diff
0 A B 2 NaN
1 A B 2 0.0
2 C D 3 NaN
3 C D 3 0.0
4 C D 3 0.0
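The two steps can also be put together in one script. The following is a sketch under the same assumptions as the question's sample data (column names col1 and col2, default index). The first row of each (col1, col2) group has no previous row to subtract, so diff() leaves NaN there, and the column becomes float as a result.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "col1": ["A", "A", "C", "C", "C"],
    "col2": ["B", "B", "D", "D", "D"],
})

keys = ["col1", "col2"]

# Step 1: per-row group size.
df_out = df.assign(col3_count=df.groupby(keys)["col1"].transform("size"))

# Step 2: difference from the previous row within the same group.
# diff() restarts at each group boundary, so no value leaks across groups.
result = df_out.assign(diff=df_out.groupby(keys)["col3_count"].diff())
print(result)
```

If the leading NaN in each group is unwanted, it could be replaced afterwards, for example with result["diff"].fillna(0), though the question's expected output keeps it.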