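I have a pandas DataFrame and am trying to pivot a grouped output so that each unique value becomes a column and the count of each value becomes the cell values in a new DataFrame.
This is the starting DataFrame: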
import pandas as pd

df = pd.DataFrame([('gold', 'bronze', 'silver'),
                   ('silver', 'gold', 'bronze'),
                   ('gold', 'silver', 'bronze'),
                   ('bronze', 'silver', 'gold')],
                  columns=('Canada', 'China', 'South Korea'))
df.head()
   Canada   China South Korea
0    gold  bronze      silver
1  silver    gold      bronze
2    gold  silver      bronze
3  bronze  silver        gold
The desired output would look like this:
        nation  gold  silver  bronze
0       Canada     2       1       1
1        China     1       2       1
2  South Korea     1       1       2
Answer 0 (score: 3)
You can use df.apply with pd.value_counts*:
df.apply(pd.value_counts).T
             bronze  gold  silver
Canada            1     2       1
China             1     1       2
South Korea       2     1       1
*I couldn't find documentation for pd.value_counts, so I linked to the function on GitHub instead.
Edit: on reading the source code, pd.Series.value_counts simply calls pd.value_counts.
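Newer pandas releases (2.1 and later, as far as I know) deprecate the top-level pd.value_counts in favour of the Series method. The following is a minimal sketch of the same idea written with Series.value_counts; the variable name counts is only illustrative.

```python
import pandas as pd

df = pd.DataFrame([('gold', 'bronze', 'silver'),
                   ('silver', 'gold', 'bronze'),
                   ('gold', 'silver', 'bronze'),
                   ('bronze', 'silver', 'gold')],
                  columns=('Canada', 'China', 'South Korea'))

# Count the medals in each column with the Series method, then transpose
# so that nations become rows and medal types become columns.
counts = df.apply(lambda s: s.value_counts()).T

print(counts)
```

The result has one row per nation and one column per medal, the same as in the answer above.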
Answer 1 (score: 1)
Use pd.get_dummies with sum:
pd.get_dummies(df.T, prefix='', prefix_sep='').sum(level=0, axis=1)
Out[995]:
             bronze  gold  silver
Canada            1     2       1
China             1     1       2
South Korea       2     1       1
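The level argument of sum was deprecated in pandas 1.3 and, to my knowledge, removed in 2.0, so the one-liner above fails on current versions. A possible equivalent, sketched here on the assumption that grouping the transposed dummy frame by its index is acceptable, is:

```python
import pandas as pd

df = pd.DataFrame([('gold', 'bronze', 'silver'),
                   ('silver', 'gold', 'bronze'),
                   ('gold', 'silver', 'bronze'),
                   ('bronze', 'silver', 'gold')],
                  columns=('Canada', 'China', 'South Korea'))

# One indicator column per (original row, medal); with an empty prefix the
# column labels are just the medal names, so they repeat across rows.
dummies = pd.get_dummies(df.T, prefix='', prefix_sep='', dtype=int)

# Sum the duplicated medal columns: transpose, group on the (now duplicated)
# index labels, sum, and transpose back.
counts = dummies.T.groupby(level=0).sum().T

print(counts)
```

Passing dtype=int is an addition of mine; it keeps the indicators as integers regardless of the default dtype get_dummies uses.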
Answer 2 (score: 1)
w = df.melt()
       variable   value
0        Canada    gold
1        Canada  silver
2        Canada    gold
3        Canada  bronze
4         China  bronze
5         China    gold
6         China  silver
7         China  silver
8   South Korea  silver
9   South Korea  bronze
10  South Korea  bronze
11  South Korea    gold
Then:
pd.crosstab(w['variable'],w['value'])
The desired result:
value        bronze  gold  silver
variable
Canada            1     2       1
China             1     1       2
South Korea       2     1       1
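To get from this crosstab to the exact layout asked for in the question (a nation column followed by gold, silver, bronze), something along these lines should work. The names long, medal and result are illustrative choices, not part of the original answer.

```python
import pandas as pd

df = pd.DataFrame([('gold', 'bronze', 'silver'),
                   ('silver', 'gold', 'bronze'),
                   ('gold', 'silver', 'bronze'),
                   ('bronze', 'silver', 'gold')],
                  columns=('Canada', 'China', 'South Korea'))

# Long format: one row per (nation, medal) observation.
long = df.melt(var_name='nation', value_name='medal')

# Cross-tabulate nations against medals to get the counts.
counts = pd.crosstab(long['nation'], long['medal'])

# Put the medal columns in the requested order, drop the 'medal' label on
# the column axis, and move 'nation' from the index into a regular column.
result = (counts[['gold', 'silver', 'bronze']]
          .rename_axis(columns=None)
          .reset_index())

print(result)
```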
Answer 3 (score: 0)
df = pd.DataFrame([('gold', 'bronze', 'silver'),
                   ('silver', 'gold', 'bronze'),
                   ('gold', 'silver', 'bronze'),
                   ('bronze', 'silver', 'gold')],
                  columns=('Canada', 'China', 'South Korea')).transpose()
df.apply(pd.value_counts, axis=1)
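One caveat with the row-wise apply approach: if a nation never received a particular medal, that cell comes out as NaN rather than 0. The sketch below uses hypothetical data (the last row of the original frame is dropped so that Canada has no bronze and South Korea has no gold) and fills the gaps explicitly.

```python
import pandas as pd

# Hypothetical variant of the question's data with only three events.
df = pd.DataFrame([('gold', 'bronze', 'silver'),
                   ('silver', 'gold', 'bronze'),
                   ('gold', 'silver', 'bronze')],
                  columns=('Canada', 'China', 'South Korea'))

counts = (df.T
            .apply(lambda row: row.value_counts(), axis=1)  # NaN where a medal is absent
            .fillna(0)                                      # treat absent medals as zero
            .astype(int))                                   # restore integer counts

print(counts)
```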