Pandas: unique values as columns with their counts

Time: 2020-10-23 19:30:28

Tags: python pandas dataframe

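I'm working with a pandas DataFrame and trying to pivot it into a grouped output: each unique value should become a column, and the count of that value for each original column should become the values of the new DataFrame.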

This is the starting DataFrame:

import pandas as pd

df = pd.DataFrame([('gold', 'bronze', 'silver'),
                   ('silver', 'gold', 'bronze'),
                   ('gold', 'silver', 'bronze'),
                   ('bronze', 'silver', 'gold')],
                  columns=('Canada', 'China', 'South Korea'))
df.head()

    Canada  China   South Korea
0   gold    bronze  silver
1   silver  gold    bronze
2   gold    silver  bronze
3   bronze  silver  gold

The desired output would look like this:

    nation      gold    silver  bronze
0   Canada        2          1       1
1   China         1          2       1
2   South Korea   1          1       2

4 answers:

Answer 0 (score: 3)

You can use df.apply together with pd.value_counts*, then transpose the result:

df.apply(pd.value_counts).T

             bronze  gold  silver
Canada            1     2       1
China             1     1       2
South Korea       2     1       1

*I could not find documentation for pd.value_counts, so I linked to the function's source on GitHub instead.

Edit: after reading the source code, pd.Series.value_counts simply calls pd.value_counts.
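
The output above has the nations as the index and the medals in alphabetical order, which is not quite the layout requested in the question. The following is a minimal sketch of one way to get there, not part of the original answer. It uses pd.Series.value_counts rather than the top-level pd.value_counts, since newer pandas releases deprecate the latter. The names counts and result are illustrative, and the fillna(0) step is an assumption for data where some medal never appears in a column.

```python
import pandas as pd

df = pd.DataFrame([('gold', 'bronze', 'silver'),
                   ('silver', 'gold', 'bronze'),
                   ('gold', 'silver', 'bronze'),
                   ('bronze', 'silver', 'gold')],
                  columns=('Canada', 'China', 'South Korea'))

# Count the values in each column with the Series method,
# then transpose so that each nation becomes a row.
counts = df.apply(pd.Series.value_counts).T

# Assumption: a medal missing from a column would appear as NaN,
# so fill with 0 and cast back to integers.
counts = counts.fillna(0).astype(int)

# Put the medal columns in the requested order and turn the
# index into a regular 'nation' column.
result = (counts[['gold', 'silver', 'bronze']]
          .rename_axis('nation')
          .reset_index())
print(result)
```

The column order is set explicitly here because value_counts does not know anything about medal ranking.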

Answer 1 (score: 1)

Use pd.get_dummies with sum:

pd.get_dummies(df.T, prefix='',prefix_sep='').sum(level=0,axis=1)

Out[995]:
             bronze  gold  silver
Canada            1     2       1
China             1     1       2
South Korea       2     1       1
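
Note that the level argument of DataFrame.sum was deprecated in pandas 1.3 and removed in 2.0, so the one-liner above may fail on a recent installation. A possible equivalent, sketched here under that assumption, transposes the indicator columns and groups them by label instead. The names dummies and counts are illustrative and do not come from the original answer.

```python
import pandas as pd

df = pd.DataFrame([('gold', 'bronze', 'silver'),
                   ('silver', 'gold', 'bronze'),
                   ('gold', 'silver', 'bronze'),
                   ('bronze', 'silver', 'gold')],
                  columns=('Canada', 'China', 'South Korea'))

# One indicator column per (original row, medal) pair. With an empty
# prefix the column labels are just the medal names, so they repeat.
# dtype=int keeps 0/1 values rather than booleans.
dummies = pd.get_dummies(df.T, prefix='', prefix_sep='', dtype=int)

# Sum the columns that share a label: transpose, group the duplicated
# labels on the index, sum, and transpose back.
counts = dummies.T.groupby(level=0).sum().T
print(counts)
```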

Answer 2 (score: 1)

w = df.melt()

    variable    value
0   Canada      gold
1   Canada      silver
2   Canada      gold
3   Canada      bronze
4   China       bronze
5   China       gold
6   China       silver
7   China       silver
8   South Korea silver
9   South Korea bronze
10  South Korea bronze
11  South Korea gold

Then:

pd.crosstab(w['variable'],w['value'])

The desired result:

value        bronze gold    silver
variable            
Canada        1      2       1
China         1      1       2
South Korea   2      1       1
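
For reference, the two steps can be combined into one runnable sketch. The var_name and value_name arguments to melt are an optional addition, not part of the original answer; they replace the default 'variable' and 'value' labels with more descriptive ones, and the names 'nation', 'medal', and table are chosen here for illustration.

```python
import pandas as pd

df = pd.DataFrame([('gold', 'bronze', 'silver'),
                   ('silver', 'gold', 'bronze'),
                   ('gold', 'silver', 'bronze'),
                   ('bronze', 'silver', 'gold')],
                  columns=('Canada', 'China', 'South Korea'))

# Long format: one row per (nation, medal) observation.
w = df.melt(var_name='nation', value_name='medal')

# Cross-tabulate: rows are nations, columns are medals, cells are counts.
table = pd.crosstab(w['nation'], w['medal'])
print(table)
```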

Answer 3 (score: 0)

# Transpose first so that each nation becomes a row.
df = pd.DataFrame([('gold', 'bronze', 'silver'),
                   ('silver', 'gold', 'bronze'),
                   ('gold', 'silver', 'bronze'),
                   ('bronze', 'silver', 'gold')],
                  columns=('Canada', 'China', 'South Korea')).transpose()

# Count the values along each row (axis=1).
df.apply(pd.value_counts, axis=1)