import pandas as pd
df = pd.DataFrame(data={'a': [(999, 777), (777, 999), (777, 999), (999, 777), (299, 331), (299, 331), (543, 829), (543, 829), (829, 543), (829, 543)], 'b': [44, 32, 42, 15, 65, 36, 92, 57, 77, 42]})
df.groupby('a').aggregate('mean')
b
a
(299, 331) 50.5
(543, 829) 74.5
(777, 999) 37.0
(829, 543) 59.5
(999, 777) 29.5
For example, I don't want separate rows for the tuples (999, 777) and (777, 999); I'd like pandas to treat them as the same key. The output I'm looking for is:
b
a
(299, 331) 50.50
(543, 829) 67.00
(777, 999) 33.25
Answer 0 (score: 2)
Use a known sort order, i.e. sort each tuple so that both orderings map to the same key:
In [6]: df.groupby(df['a'].apply(lambda x: tuple(sorted(x)))).aggregate('mean')
Out[6]:
b
a
(299, 331) 50.50
(543, 829) 67.00
(777, 999) 33.25
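A self-contained sketch of the same idea, in case it helps to run it end to end. The helper name `normalize_pair` is my own and not part of the question; the sketch also selects column `b` explicitly, on the assumption that you only want the mean of `b` and not of the tuple column.

```python
import pandas as pd

df = pd.DataFrame({
    'a': [(999, 777), (777, 999), (777, 999), (999, 777), (299, 331),
          (299, 331), (543, 829), (543, 829), (829, 543), (829, 543)],
    'b': [44, 32, 42, 15, 65, 36, 92, 57, 77, 42],
})

def normalize_pair(t):
    # Hypothetical helper: sorting the elements makes (999, 777)
    # and (777, 999) produce the same key, (777, 999).
    return tuple(sorted(t))

# Group by the normalized key instead of the raw column.
key = df['a'].apply(normalize_pair)

# Select 'b' explicitly so only the numeric column is averaged.
result = df.groupby(key)['b'].mean()
print(result)
```

Note that `df` itself is not modified here; only the grouping key is normalized.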
Answer 1 (score: 1)
You can sort the tuples beforehand:
In [45]:
df['a'] = df['a'].apply(lambda x: tuple(sorted(x)))
df.groupby('a').aggregate('mean')
Out[45]:
b
a
(299, 331) 50.50
(543, 829) 67.00
(777, 999) 33.25
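Pre-sorting as above overwrites `df['a']`. If you need to keep the original tuples, one possible variation is to put the sorted tuples in a separate column and group by that instead. This is only a sketch; the column name `a_key` is an assumption of mine, not something from the question.

```python
import pandas as pd

df = pd.DataFrame({
    'a': [(999, 777), (777, 999), (777, 999), (999, 777), (299, 331),
          (299, 331), (543, 829), (543, 829), (829, 543), (829, 543)],
    'b': [44, 32, 42, 15, 65, 36, 92, 57, 77, 42],
})

# assign() returns a new frame, so the original 'a' column is untouched.
# 'a_key' is a hypothetical column holding the order-independent key.
out = (
    df.assign(a_key=df['a'].map(lambda t: tuple(sorted(t))))
      .groupby('a_key')['b']
      .mean()
)
print(out)
```

The trade-off is an extra column in the intermediate frame in exchange for leaving the source data as it was.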