Calculating mode in Pandas when using groupby

Time: 2019-02-24 03:11:51

Tags: python pandas pandas-groupby

I have a table as follows:

Col1 | Col2 | Col3
AAA  | 1    | a
AAA  | 1    | a
AAA  | 1    | b
AAA  | 2    | b
AAA  | 2    | b
AAA  | 2    | b
AAA  | 3    | a
BBB  | 1    | b
BBB  | 1    | b

I want to reduce the table in the following two steps:

  1. Find the most frequently occurring value in Col3 for each (Col1, Col2) value pair.

  2. From the result of step 1, keep only the most frequently occurring value for each Col1 value.

Applying step 1 to the table above: the mode (the most frequently occurring value) of Col3 for (AAA, 1) is a, and so on. We get:

Col1 | Col2 | newCol1
AAA  | 1    | a
AAA  | 2    | b
AAA  | 3    | a
BBB  | 1    | b

Applying step 2 to this table, a is the mode for AAA and b is the mode for BBB, so we get:

Col1 | newCol2
AAA  | a  
BBB  | b

2 answers:

Answer 0 (score: 2)

So you mean:

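A minimal sketch of the two steps from the question, written out one groupby at a time. It assumes the table is loaded into a DataFrame named `df`; the helper `first_mode` and the variables `step1` and `step2` are hypothetical names introduced here for illustration, and ties are resolved by taking the first value that `Series.mode()` returns.

```python
import pandas as pd

# Sample data copied from the question's table.
df = pd.DataFrame({
    "Col1": ["AAA"] * 7 + ["BBB"] * 2,
    "Col2": [1, 1, 1, 2, 2, 2, 3, 1, 1],
    "Col3": ["a", "a", "b", "b", "b", "b", "a", "b", "b"],
})

def first_mode(s):
    # Series.mode() returns all tied values in sorted order; keep the first.
    return s.mode().iloc[0]

# Step 1: most frequent Col3 value for each (Col1, Col2) pair.
step1 = (
    df.groupby(["Col1", "Col2"])["Col3"]
      .agg(first_mode)
      .reset_index(name="newCol1")
)

# Step 2: most frequent step-1 value for each Col1.
step2 = (
    step1.groupby("Col1")["newCol1"]
         .agg(first_mode)
         .reset_index(name="newCol2")
)
print(step2)
```

Keeping `step1` as its own DataFrame makes it easy to compare against the intermediate table in the question before reducing further.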

Answer 1 (score: 2)

Let's do it in one line

(df.groupby(['Col1', 'Col2']).Col3.apply(pd.Series.mode)
   .groupby(level=0).apply(pd.Series.mode))
Out[136]: 
Col1   
AAA   0    a
BBB   0    b
Name: Col3, dtype: object
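One thing to watch with the one-liner above: `pd.Series.mode` returns every tied value, so a group with a tie contributes more than one row (that is where the extra `0` index level comes from). The sketch below uses hypothetical data, not the question's table, to show the difference between keeping all modes and keeping only the first one.

```python
import pandas as pd

# Hypothetical data (not from the question): group "X" has a tie between "a" and "b".
tied = pd.DataFrame({
    "Col1": ["X", "X", "Y"],
    "Col3": ["a", "b", "b"],
})

# apply(pd.Series.mode) keeps every tied value, so "X" yields two rows.
all_modes = tied.groupby("Col1")["Col3"].apply(pd.Series.mode)
print(all_modes)

# Taking .iloc[0] keeps exactly one value per group (the first in sorted order).
first_only = tied.groupby("Col1")["Col3"].agg(lambda s: s.mode().iloc[0])
print(first_only)
```

Which behavior is right depends on whether a tie should be reported or silently broken; the question's data has no ties, so both give the same result there.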

Just for fun

pd.crosstab([df.Col1,df.Col2],df.Col3).idxmax(1).groupby(level=0).apply(pd.Series.mode)
Out[140]: 
Col1   
AAA   0    a
BBB   0    b
dtype: object
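To see why the crosstab version works, it can help to look at the intermediate count table. The sketch below rebuilds the question's data (the names `counts` and `step1` are introduced here for illustration): `pd.crosstab` counts each Col3 value per (Col1, Col2) pair, and `idxmax(axis=1)` then picks the column label with the largest count in each row. As far as I know, `idxmax` returns the first such label when counts are tied, so ties are broken by column order rather than reported.

```python
import pandas as pd

# Sample data copied from the question's table.
df = pd.DataFrame({
    "Col1": ["AAA"] * 7 + ["BBB"] * 2,
    "Col2": [1, 1, 1, 2, 2, 2, 3, 1, 1],
    "Col3": ["a", "a", "b", "b", "b", "b", "a", "b", "b"],
})

# Rows are (Col1, Col2) pairs, columns are Col3 values, cells are counts.
counts = pd.crosstab([df.Col1, df.Col2], df.Col3)
print(counts)

# For each row, the column label holding the largest count, i.e. the mode.
step1 = counts.idxmax(axis=1)
print(step1)
```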