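I have the following dataset from a survey that gave each participant a list of foods and asked them to rate how likely they were to eat each food this week. I want to plot, for each food, the count of each likelihood label on a chart.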
Person  Food     Label
John    Pizza    Likely
John    Chinese  Unlikely
John    French   Very Unlikely
Debbie  Pizza    Unlikely
Debbie  Chinese  Very Likely
Debbie  French   Very Unlikely
For example:
Pizza Likely 1
Pizza Unlikely 1
Chinese Unlikely 1
Chinese Very Likely 1
French Very Unlikely 2
So far, I have read the file into a DataFrame and done some basic cleaning.
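import pandas as pd

raw_data = pd.read_excel('my_file_path')
# cleaning code (omitted here); the copy below is only a placeholder for it
clean_data = raw_data.copy()
# count the rows in each (Food, Label) group
results = clean_data.groupby(['Food', 'Label']).count()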
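To make the attempt above reproducible without the Excel file, here is a minimal sketch that rebuilds the sample rows in memory. The variable name `long_counts` is introduced only for illustration, and the DataFrame is assumed to match the cleaned data. It shows that `groupby(...).count()` returns a long table indexed by (Food, Label), which is not yet in the shape a grouped bar chart expects.

```python
import pandas as pd

# Sample rows copied from the question; the real data would come from pd.read_excel.
clean_data = pd.DataFrame({
    "Person": ["John", "John", "John", "Debbie", "Debbie", "Debbie"],
    "Food": ["Pizza", "Chinese", "French", "Pizza", "Chinese", "French"],
    "Label": ["Likely", "Unlikely", "Very Unlikely",
              "Unlikely", "Very Likely", "Very Unlikely"],
})

# Counting without selecting a column gives one count column per remaining
# column (here only "Person"), indexed by the (Food, Label) pairs.
long_counts = clean_data.groupby(["Food", "Label"]).count()
print(long_counts)
```

Only the (Food, Label) pairs that actually occur appear in the index, so combinations with no answers are missing rather than zero.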
Answer 0 (score: 3)
I believe you need to select the column Person after groupby, reshape with unstack, and plot with DataFrame.plot.bar:
results = clean_data.groupby(['Food', 'Label'])['Person'].count().unstack(fill_value=0)
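As a rough illustration of what the reshape does, the sketch below runs the same chain on the sample rows from the question. The names `sparse` and `wide` are hypothetical and the DataFrame is assumed to match the cleaned data. Without `fill_value`, the combinations that never occur come back as NaN; with `fill_value=0` they become zero counts.

```python
import pandas as pd

# Sample rows copied from the question (assumed to match the cleaned data).
clean_data = pd.DataFrame({
    "Person": ["John", "John", "John", "Debbie", "Debbie", "Debbie"],
    "Food": ["Pizza", "Chinese", "French", "Pizza", "Chinese", "French"],
    "Label": ["Likely", "Unlikely", "Very Unlikely",
              "Unlikely", "Very Likely", "Very Unlikely"],
})

# Series of counts indexed by (Food, Label).
counts = clean_data.groupby(["Food", "Label"])["Person"].count()

# Move the Label level into the columns: one row per food, one column per label.
sparse = counts.unstack()            # missing combinations are NaN
wide = counts.unstack(fill_value=0)  # missing combinations are 0

print(wide)
```

Each row of `wide` becomes one group of bars and each column one bar colour when the table is passed to `plot.bar()`.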
Another solution using crosstab:
results = pd.crosstab(clean_data['Food'], clean_data['Label'])
print(results)
Label    Likely  Unlikely  Very Likely  Very Unlikely
Food
Chinese       0         1            1              0
French        0         0            0              2
Pizza         1         1            0              0
results.plot.bar()
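One optional refinement, sketched here as an assumption rather than part of the answer: the columns come out in alphabetical order, so the bars do not follow the likelihood scale. The hypothetical `order` list below reorders them before plotting. The plotting step is wrapped in a try block because it needs matplotlib, which may not be installed.

```python
import pandas as pd

# Sample rows copied from the question (assumed to match the cleaned data).
clean_data = pd.DataFrame({
    "Person": ["John", "John", "John", "Debbie", "Debbie", "Debbie"],
    "Food": ["Pizza", "Chinese", "French", "Pizza", "Chinese", "French"],
    "Label": ["Likely", "Unlikely", "Very Unlikely",
              "Unlikely", "Very Likely", "Very Unlikely"],
})

results = pd.crosstab(clean_data["Food"], clean_data["Label"])

# Hypothetical ordering of the scale, from least to most likely.
order = ["Very Unlikely", "Unlikely", "Likely", "Very Likely"]
ordered = results.reindex(columns=order, fill_value=0)

try:
    import matplotlib
    matplotlib.use("Agg")  # non-interactive backend, so no display is needed
    ax = ordered.plot.bar(rot=0)
    ax.set_ylabel("Number of respondents")
except ImportError:
    ax = None  # matplotlib is not available; the table itself is still usable
```

Passing `fill_value=0` to `reindex` also keeps a label in the chart when nobody chose it.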