Question

The pandas documentation (http://pandas.pydata.org/pandas-docs/stable/groupby.html) has an example that uses groupby with the get_letter_type function below. Why doesn't the result of describe include column 'B'?

In [5]: def get_letter_type(letter):
   ...:     if letter.lower() in 'aeiou':
   ...:         return 'vowel'
   ...:     else:
   ...:         return 'consonant'
   ...: 
In [6]: grouped = df.groupby(get_letter_type, axis=1)
In [7]: grouped.describe()

The displayed result has no column B. Can anyone explain why? It seems to me that B should belong to the 'consonant' group. Is there something I'm missing?

Answer 1

For me it works if the DataFrame has only columns A and B:

import pandas as pd

df = pd.DataFrame({'A' : ['foo', 'bar', 'foo', 'bar',
                          'foo', 'bar', 'foo', 'foo'],
                   'B' : ['one', 'one', 'two', 'three',
                          'two', 'two', 'one', 'three']})

def get_letter_type(letter):
    if letter.lower() in 'aeiou':
        return 'vowel'
    else:
        return 'consonant'


grouped = df.groupby(get_letter_type, axis=1)

for i, g in (grouped):
    print (i)
    print (g)

consonant
       B
0    one
1    one
2    two
3  three
4    two
5    two
6    one
7  three

vowel
     A
0  foo
1  bar
2  foo
3  bar
4  foo
5  bar
6  foo
7  foo    

print (grouped.describe())    
       consonant vowel
               B     A
count          8     8
unique         3     2
top          one   foo
freq           3     5
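
As a rough illustration of how the groups above come about: with axis=1 the function is applied to each column label, not to the cell values. The sketch below mimics that mapping in plain Python; the `columns` list and the `groups` dict are illustrative names, not part of pandas.

```python
def get_letter_type(letter):
    if letter.lower() in 'aeiou':
        return 'vowel'
    else:
        return 'consonant'

# Column labels of the four-column example frame (assumed here for illustration)
columns = ['A', 'B', 'C', 'D']

# Collect the labels under the key the function returns for each of them
groups = {}
for label in columns:
    groups.setdefault(get_letter_type(label), []).append(label)

print(groups)  # {'vowel': ['A'], 'consonant': ['B', 'C', 'D']}
```

So B is assigned to the consonant group at the grouping stage.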

I think this is the automatic exclusion of nuisance columns, which applies when a group, e.g. consonant, contains both numeric and string columns:

import pandas as pd
import numpy as np

df = pd.DataFrame({'A' : ['foo', 'bar', 'foo', 'bar',
                          'foo', 'bar', 'foo', 'foo'],
                   'B' : ['one', 'one', 'two', 'three',
                          'two', 'two', 'one', 'three'],
                   'C' : np.random.randn(8),
                   'D' : np.random.randn(8)})

def get_letter_type(letter):
    if letter.lower() in 'aeiou':
        return 'vowel'
    else:
        return 'consonant'


grouped = df.groupby(get_letter_type, axis=1)

for i, g in (grouped):
    print (i)
    print (g)

consonant
       B         C         D
0    one  0.322759  0.348806
1    one -0.122110 -1.566801
2    two  1.846408 -0.830144
3  three -0.509248  0.486773
4    two -1.061608 -0.069366
5    two  1.083728  0.429543
6    one -0.664480 -0.702906
7  three  0.587159  0.978647
vowel
     A
0  foo
1  bar
2  foo
3  bar
4  foo
5  bar
6  foo
7  foo

print (grouped.describe())    
       consonant           vowel
               C         D     A
25%    -0.548056 -0.734716   NaN
50%     0.100325  0.139720   NaN
75%     0.711301  0.443851   NaN
count   8.000000  8.000000     8
freq         NaN       NaN     5
max     1.846408  0.978647   NaN
mean    0.185326 -0.115681   NaN
min    -1.061608 -1.566801   NaN
std     0.971055  0.848251   NaN
top          NaN       NaN   foo
unique       NaN       NaN     2
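
A minimal sketch of why B disappears from the summary, assuming the behaviour comes from DataFrame.describe() itself: on a frame with mixed dtypes it summarises only the numeric columns by default, and include='all' asks for every column. The sketch slices the consonant columns directly instead of using groupby(axis=1), which recent pandas versions deprecate; the seed and the variable names are illustrative.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)  # fixed seed, only to make the sketch repeatable
df = pd.DataFrame({'A': ['foo', 'bar', 'foo', 'bar',
                         'foo', 'bar', 'foo', 'foo'],
                   'B': ['one', 'one', 'two', 'three',
                         'two', 'two', 'one', 'three'],
                   'C': rng.standard_normal(8),
                   'D': rng.standard_normal(8)})

# The columns that land in the 'consonant' group: one string, two numeric
consonant = df[['B', 'C', 'D']]

# Default describe(): the string column B is left out
default_summary = consonant.describe()
print(list(default_summary.columns))  # ['C', 'D']

# include='all' keeps B and fills non-applicable statistics with NaN
full_summary = consonant.describe(include='all')
print(list(full_summary.columns))  # ['B', 'C', 'D']
```

When a group holds only string columns, as with A in the vowel group, describe() falls back to count, unique, top and freq, which matches the output above.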
