Dataframe:
col1 col2
a 50
b 40
a 40
a 30
b 20
a 20
b 30
b 50
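I need to group the rows by col1, sort each group by col2 from high to low, and find the difference between consecutive rows within each group. Expected dataframe: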
col1 col_entity col2 diff
a a1 50 10
b a2 40 10
a a3 30 10
a a4 20 nan
b b1 40 10
a b4 50 10
b b3 30 10
b b2 20 nan
Please help. Thanks in advance.
Answer 0 (score: 1)
See if this helps:
import pandas as pd

# coerce non-numeric values in col2 to NaN, then replace them with 0
df['col2'] = pd.to_numeric(df['col2'], errors='coerce').fillna(0)
# sort ascending within each group and take the difference from the previous (lower) value
df['diff'] = df.sort_values(['col1', 'col2'], ascending=[True, True]).groupby('col1')['col2'].diff()
# display the dataframe with col1 ascending and col2 descending
df.sort_values(['col1', 'col2'], ascending=[True, False])
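To try this approach end to end, here is a minimal sketch that rebuilds the two-column dataframe from the question (the `col_entity` column is left out, since its values are not given for the input). The variable name `result` is only for illustration; everything else uses standard pandas calls.

```python
import pandas as pd

# Input data copied from the question (col1, col2 only)
df = pd.DataFrame({
    "col1": ["a", "b", "a", "a", "b", "a", "b", "b"],
    "col2": [50, 40, 40, 30, 20, 20, 30, 50],
})

# Make sure col2 is numeric; anything non-numeric becomes 0
df["col2"] = pd.to_numeric(df["col2"], errors="coerce").fillna(0)

# Sort ascending inside each group, then diff: each row minus the next lower value.
# The assignment aligns on the index, so the original row order of df is kept.
df["diff"] = (
    df.sort_values(["col1", "col2"])
      .groupby("col1")["col2"]
      .diff()
)

# Show groups in order, highest col2 first; the lowest row of each group has NaN
result = df.sort_values(["col1", "col2"], ascending=[True, False])
print(result)
```

The lowest value in each group has nothing below it, so its `diff` is NaN, which matches the `nan` rows in the expected output.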
Answer 1 (score: 0)
You can sort first, then use assign with a groupby on col1 and diff to compute the differences.
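(
    df
    # sort first: diff works on the current row order within each group
    .sort_values(['col1', 'col2'], ascending=False)
    # diff(-1) subtracts the next (lower) row, so the last row of each group is NaN
    .assign(diff=lambda x: x.groupby('col1')['col2'].diff(-1))
)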
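One thing worth checking with any groupby-plus-diff solution is that `diff` follows whatever row order the frame has when it is called. A minimal sketch on the question's data (the names `unsorted_diff` and `ordered` are just for illustration) compares the two orders:

```python
import pandas as pd

df = pd.DataFrame({
    "col1": ["a", "b", "a", "a", "b", "a", "b", "b"],
    "col2": [50, 40, 40, 30, 20, 20, 30, 50],
})

# diff on the original row order: each row minus the previous row of its group
unsorted_diff = df.groupby("col1")["col2"].diff()

# sort high to low first, then subtract the next row within each group
ordered = df.sort_values(["col1", "col2"], ascending=[True, False])
ordered = ordered.assign(diff=ordered.groupby("col1")["col2"].diff(-1))

print(unsorted_diff.tolist())
print(ordered)
```

On the unsorted frame, group b yields mixed values such as -20 and 20, while the sorted version gives a steady 10 with NaN on the lowest row of each group.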