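I am just getting started with Python and have run into what must be a common problem, but I cannot find a simple solution. I have some made-up survey data and want a meaningful summary of it. Specifically, for each question, I want to know how many times each answer ("Yes" / "Maybe" / "No") was given.
Input: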
         Question1  Question2  Question3
Answer1  Maybe      Yes        Yes
Answer2  No         Maybe      Yes
Answer3  Maybe      Maybe      No
Answer4  No         Yes        Maybe
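Now I would like a complete overview of how often each answer was given for each question. My preferred output would look like this:
(Preferred) output: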
           Yes  Maybe  No
Question1  0    2      2
Question2  2    2      0
Question3  2    1      1
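My own guess is that the solution must involve the "groupby" command. So far I have not managed to get any meaningful output from it, because summing string columns just concatenates the answers: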
df.groupby(['Question1']).sum()
          Question2 Question3
Question1
Maybe      YesMaybe     YesNo
No         MaybeYes  YesMaybe
I generated the dummy data as follows:
import numpy as np
import pandas as pd

# Generate data: first row holds the column names, first column the row labels
data = np.array([['', 'Question1', 'Question2', 'Question3'],
                 ['Answer1', 'Maybe', 'Yes', 'Yes'],
                 ['Answer2', 'No', 'Maybe', 'Yes'],
                 ['Answer3', 'Maybe', 'Maybe', 'No'],
                 ['Answer4', 'No', 'Yes', 'Maybe']])
# Convert to a pandas DataFrame
df = pd.DataFrame(data=data[1:, 1:], index=data[1:, 0], columns=data[0, 1:])
I know this must be a simple problem, but any help would be greatly appreciated.
Answer 0 (score: 1)
Simple:
df.apply(pd.value_counts).fillna(0)
       Question1  Question2  Question3
Maybe        2.0        2.0        1.0
No           2.0        0.0        1.0
Yes          0.0        2.0        2.0
If needed, you can transpose it so that the questions become the rows:
df.apply(pd.value_counts).fillna(0).T
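To get from that result to the exact layout the question asks for (integer counts, questions as rows, columns ordered Yes / Maybe / No), something like the sketch below should work. It makes a few assumptions that are not in the original answer: it uses `pd.Series.value_counts` in place of the top-level `pd.value_counts` (which recent pandas releases flag as deprecated), it builds the same dummy data from a plain dict rather than the NumPy array, and the variable name `counts` is just illustrative.

```python
import pandas as pd

# Same dummy survey data as in the question, built from a dict for brevity
df = pd.DataFrame(
    {
        "Question1": ["Maybe", "No", "Maybe", "No"],
        "Question2": ["Yes", "Maybe", "Maybe", "Yes"],
        "Question3": ["Yes", "Yes", "No", "Maybe"],
    },
    index=["Answer1", "Answer2", "Answer3", "Answer4"],
)

counts = (
    df.apply(pd.Series.value_counts)  # count each answer within every question column
    .fillna(0)                        # answers never given for a question become 0
    .astype(int)                      # NaN forced floats; convert back to integers
    .T                                # questions as rows, answers as columns
    .reindex(columns=["Yes", "Maybe", "No"], fill_value=0)  # fix the column order
)

print(counts)
```

The `reindex` step also guarantees that all three answer columns are present even if one answer never appears anywhere in the data.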