pandas - Counting occurrences of each term in a DataFrame per unique value in another column

Time: 2018-09-20 14:05:13

Tags: python pandas dataframe pivot-table

Suppose I have a DataFrame:

    term      score
0   this          0
1   that          1
2   the other     3
3   something     2
4   anything      1
5   the other     2
6   that          2
7   this          0
8   something     1

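How can I count the instances of each value in the term column, broken down by the unique values in the score column? The result should look like this: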

    term      score 0     score 1     score 2     score 3
0   this            2           0           0           0
1   that            0           1           1           0
2   the other       0           0           1           1
3   something       0           1           1           0
4   anything        0           1           0           0

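Related questions I have read here include "Python Pandas counting and summing specific conditions" and "COUNTIF in pandas python over multiple columns with multiple conditions", but neither seems to be quite what I am trying to do. The pivot_table mentioned in this question looks like it could be useful, but I am held back by my lack of experience and the brevity of the pandas documentation. Thanks for any suggestions.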

2 Answers:

Answer 0 (score: 6)

Use groupby with size, reshape with unstack, and finally apply add_prefix:

df = df.groupby(['term','score']).size().unstack(fill_value=0).add_prefix('score ')
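
To make the chain above easier to follow, here is a minimal sketch that runs it one step at a time on the sample data from the question. The intermediate names (pair_counts, wide, result) are introduced only for illustration and are not part of the original answer.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "term": ["this", "that", "the other", "something", "anything",
             "the other", "that", "this", "something"],
    "score": [0, 1, 3, 2, 1, 2, 2, 0, 1],
})

# Step 1: count rows per (term, score) pair.
# The result is a Series indexed by a two-level MultiIndex.
pair_counts = df.groupby(["term", "score"]).size()

# Step 2: move the 'score' level into the columns.
# Pairs that never occur are filled with 0 instead of NaN.
wide = pair_counts.unstack(fill_value=0)

# Step 3: turn the column labels 0, 1, 2, 3 into 'score 0', 'score 1', ...
result = wide.add_prefix("score ")

print(result)
```

Passing fill_value=0 to unstack keeps the counts as integers; without it the missing pairs would become NaN and the columns would be converted to float.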

Or use crosstab:

df = pd.crosstab(df['term'],df['score']).add_prefix('score ')

Or pivot_table:

df = (df.pivot_table(index='term',columns='score', aggfunc='size', fill_value=0)
        .add_prefix('score '))

print (df)
score      score 0  score 1  score 2  score 3
term                                         
anything         0        1        0        0
something        0        1        1        0
that             0        1        1        0
the other        0        0        1        1
this             2        0        0        0
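
The output above is sorted alphabetically by term and keeps term as the index, whereas the question shows term as an ordinary column in order of first appearance. A possible way to get that layout is sketched below, starting from the crosstab variant; the names counts, first_seen and layout are hypothetical and chosen for this example only.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "term": ["this", "that", "the other", "something", "anything",
             "the other", "that", "this", "something"],
    "score": [0, 1, 3, 2, 1, 2, 2, 0, 1],
})

counts = pd.crosstab(df["term"], df["score"]).add_prefix("score ")

# Terms in order of first appearance, as a named Index so that
# the index name 'term' is kept after reindexing.
first_seen = pd.Index(df["term"].unique(), name="term")

layout = (
    counts.reindex(first_seen)        # restore the original row order
          .rename_axis(None, axis=1)  # drop the 'score' label on the columns axis
          .reset_index()              # turn 'term' back into a regular column
)

print(layout)
```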

Answer 1 (score: 6)

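You can also use get_dummies together with set_index and a level-wise sum. The sum(level=0) form from the original answer was deprecated in pandas 1.3 and removed in 2.0, so the equivalent groupby(level=0, sort=False).sum() is used here (sort=False keeps the terms in order of first appearance):

(pd.get_dummies(df.set_index('term'), columns=['score'], prefix_sep=' ')
   .groupby(level=0, sort=False)
   .sum()
   .reset_index())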

Output:

        term  score 0  score 1  score 2  score 3
0       this        2        0        0        0
1       that        0        1        1        0
2  the other        0        0        1        1
3  something        0        1        1        0
4   anything        0        1        0        0
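
As a quick sanity check, the sketch below compares the get_dummies approach with the crosstab approach from the other answer on the question's sample data. It assumes a recent pandas version and therefore uses groupby(level=0).sum() for the level-wise sum; the names via_dummies, via_crosstab and same are made up for this example.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "term": ["this", "that", "the other", "something", "anything",
             "the other", "that", "this", "something"],
    "score": [0, 1, 3, 2, 1, 2, 2, 0, 1],
})

# One indicator column per score value, then sum the indicators per term.
dummies = pd.get_dummies(df.set_index("term"), columns=["score"], prefix_sep=" ")
via_dummies = dummies.groupby(level=0).sum().astype(int)

# Direct cross-tabulation of term against score.
via_crosstab = pd.crosstab(df["term"], df["score"]).add_prefix("score ")

# Both results are sorted by term here, so they can be compared cell by cell.
same = bool((via_dummies.to_numpy() == via_crosstab.to_numpy()).all())
print(same)
```

Both tables end up with the same row and column labels, so the choice between the two approaches is mostly a matter of taste.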