Count/pivot table for a table with multiple values in one cell

Time: 2020-11-12 14:41:29

Tags: python python-3.x pandas dataframe

I have some data that looks like this:

Class                    Instructor
Intro to Philosophy      Jake
Algorithms               Ashley/Jake
Spanish I                Ashley
Vector Calculus          Jake
Intro to Philosophy      Jake
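
For anyone reproducing this, the table above can be built as a DataFrame roughly as follows. This is only a sketch: the variable name `df` is an assumption chosen to match what the answers below use, and the values are copied from the table.

```python
import pandas as pd

# Sample data copied from the table above. The name `df` is an assumption
# that matches the variable used in the answers.
df = pd.DataFrame({
    "Class": ["Intro to Philosophy", "Algorithms", "Spanish I",
              "Vector Calculus", "Intro to Philosophy"],
    "Instructor": ["Jake", "Ashley/Jake", "Ashley", "Jake", "Jake"],
})
print(df)
```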

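How can I build a count or pivot table like the one below that correctly counts the rows where both Ashley and Jake teach a class? The single-instructor case is trivial, but cells that list two or more instructors for the same class are tripping me up.

I want to get something like this: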

                         Jake        Ashley
Intro to Philosophy         2             0
Algorithms                  1             1
Spanish I                   0             1
Vector Calculus             1             0
Total                       4             2

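3 answers:

Answer 0 (score: 3)

You can use .str.get_dummies to split the Instructor field on '/' and turn it into 0/1 indicator columns, one per instructor. Then group by Class and sum: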

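# One 0/1 indicator column per instructor, split on '/'
ret = (df['Instructor'].str.get_dummies('/')
       .groupby(df['Class']).sum()   # sum the indicators within each class
)
# Append a row of column totals
ret.loc['Total'] = ret.sum()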

Output:

                     Ashley  Jake
Class                            
Algorithms                1     1
Intro to Philosophy       0     2
Spanish I                 1     0
Vector Calculus           0     1
Total                     2     4
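
To make the two steps easier to follow, here is a self-contained sketch that keeps the intermediate indicator table and then restores the row and column order from the question. The sample data is rebuilt from the question, and the names `dummies` and `ret` plus the optional reordering step are illustrative additions, not part of the answer above.

```python
import pandas as pd

# Sample data rebuilt from the question
df = pd.DataFrame({
    "Class": ["Intro to Philosophy", "Algorithms", "Spanish I",
              "Vector Calculus", "Intro to Philosophy"],
    "Instructor": ["Jake", "Ashley/Jake", "Ashley", "Jake", "Jake"],
})

# Step 1: one indicator column per instructor; "Ashley/Jake" sets both to 1
dummies = df["Instructor"].str.get_dummies("/")

# Step 2: sum the indicators within each class
ret = dummies.groupby(df["Class"]).sum()

# Optional: classes in first-appearance order, columns in the asker's order
ret = ret.reindex(df["Class"].unique())[["Jake", "Ashley"]]

# Append the column totals as a final row
ret.loc["Total"] = ret.sum()
print(ret)
```

Because each instructor gets a separate indicator column, a shared class contributes one count to every instructor listed in that cell.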

Answer 1 (score: 2)

You can split the Instructor column, explode it into one row per instructor, then count and pivot:

In [1746]: df.Instructor = df.Instructor.str.split('/')

In [1747]: df = df.explode('Instructor')

In [1751]: x = df.groupby('Instructor').Class.value_counts().reset_index(level=0).pivot(columns='Instructor', values='Class').fillna(0)

In [1754]: x.loc['Total'] = x.sum()

In [1755]: x
Out[1755]: 
Instructor           Ashley  Jake
Class                            
Algorithms              1.0   1.0
Intro_to_Philosophy     0.0   2.0
Spanish_I               1.0   0.0
Vector_Calculus         0.0   1.0
Total                   2.0   4.0
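
The pivot call above looks up the counts under the column name Class. I believe newer pandas releases (2.0 onward) name the value_counts result count instead, in which case that lookup would fail. The sketch below sidesteps the naming question with groupby(...).size().unstack(...), which is a swapped-in technique and not the answerer's chain. The sample data is rebuilt from the question, and the names `exploded` and `counts` are illustrative.

```python
import pandas as pd

# Sample data rebuilt from the question
df = pd.DataFrame({
    "Class": ["Intro to Philosophy", "Algorithms", "Spanish I",
              "Vector Calculus", "Intro to Philosophy"],
    "Instructor": ["Jake", "Ashley/Jake", "Ashley", "Jake", "Jake"],
})

# One row per (class, instructor) pair
exploded = df.assign(Instructor=df["Instructor"].str.split("/")).explode("Instructor")

# Count the pairs, then spread instructors into columns; missing pairs become 0
counts = (
    exploded.groupby(["Class", "Instructor"]).size()
    .unstack(fill_value=0)
)

# Append the column totals as a final row
counts.loc["Total"] = counts.sum()
print(counts)
```

Passing fill_value=0 to unstack should also keep the counts as integers, so no fillna step is needed.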

Answer 2 (score: 1)

Let's explode first, then use crosstab:

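import pandas as pd

# Split 'Ashley/Jake' into a list, then give each instructor its own row
df['Instructor'] = df['Instructor'].str.split('/')
df = df.explode('Instructor')

# Cross-tabulate Class against Instructor
out = pd.crosstab(df['Class'], df['Instructor'])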
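
The crosstab result has no Total row yet. A minimal end-to-end sketch that adds one, assuming the sample data from the question, is below. The name `exploded` and the reset_index call are my additions: resetting the index is only a precaution against the duplicate index labels that explode leaves behind, not something the answer above requires.

```python
import pandas as pd

# Sample data rebuilt from the question
df = pd.DataFrame({
    "Class": ["Intro to Philosophy", "Algorithms", "Spanish I",
              "Vector Calculus", "Intro to Philosophy"],
    "Instructor": ["Jake", "Ashley/Jake", "Ashley", "Jake", "Jake"],
})

exploded = (
    df.assign(Instructor=df["Instructor"].str.split("/"))
      .explode("Instructor")
      .reset_index(drop=True)  # precaution: drop duplicated index labels
)

# Rows are classes, columns are instructors, cells are occurrence counts
out = pd.crosstab(exploded["Class"], exploded["Instructor"])

# Append the column totals as a final row
out.loc["Total"] = out.sum()
print(out)
```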