Summarizing categorical questionnaire data with pandas

Time: 2018-07-02 00:53:11

Tags: python pandas

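I am just starting out with Python and have run into what must be a common problem, but I can't find a simple solution. I have some dummy questionnaire data and want a meaningful summary of it. Specifically, for each question I want to know how many times each answer ("Yes" / "Maybe" / "No") was given.

Input: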

         Question1   Question2   Question3
Answer1  Maybe       Yes         Yes
Answer2  No          Maybe       Yes
Answer3  Maybe       Maybe       No
Answer4  No          Yes         Maybe

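Now I would like a complete overview of how often each answer was given for each question. The preferred output would look like this:

(Preferred) output: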

           Yes     Maybe    No
Question1  0       2        2
Question2  2       2        0
Question3  2       1        1

My own guess is that the solution must involve the "groupby" command. So far I have not managed to get any meaningful output:

df.groupby(['Question1']).sum()
      Question2 Question3
Question1                    
Maybe      YesMaybe     YesNo
No         MaybeYes  YesMaybe

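I generated the dummy data as follows:

import numpy as np
import pandas as pd

# Generate data: first row holds the column labels, first column the row labels
data = np.array([['','Question1','Question2','Question3'],['Answer1',"Maybe","Yes","Yes"],['Answer2',"No","Maybe","Yes"],['Answer3',"Maybe","Maybe","No"],['Answer4',"No","Yes","Maybe"]])

# Convert to a pandas DataFrame, using the labels as index and columns
df = pd.DataFrame(data=data[1:,1:],index=data[1:,0],columns=data[0,1:])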

I know this must be a simple challenge, but any help would be much appreciated.

1 answer:

Answer 0: (score: 1)

Simply:

df.apply(pd.value_counts).fillna(0)
       Question1  Question2  Question3
Maybe        2.0        2.0        1.0
No           2.0        0.0        1.0
Yes          0.0        2.0        2.0

Transpose it if needed:

df.apply(pd.value_counts).fillna(0).T
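
To get exactly the layout from the question (integer counts, columns ordered Yes / Maybe / No), something like the following sketch should work. It rebuilds the dummy data from a dict, and the names `answers` and `summary` are just illustrative. It calls `Series.value_counts` through a lambda because, as far as I know, the top-level `pd.value_counts` is deprecated in recent pandas releases.

```python
import pandas as pd

# Same dummy data as in the question, built directly from a dict
df = pd.DataFrame(
    {
        "Question1": ["Maybe", "No", "Maybe", "No"],
        "Question2": ["Yes", "Maybe", "Maybe", "Yes"],
        "Question3": ["Yes", "Yes", "No", "Maybe"],
    },
    index=["Answer1", "Answer2", "Answer3", "Answer4"],
)

# Desired column order of the summary (illustrative name)
answers = ["Yes", "Maybe", "No"]

summary = (
    df.apply(lambda col: col.value_counts())  # count each answer per question
      .reindex(answers)                       # fix the order, keep answers with no hits
      .fillna(0)                              # an answer never given counts as 0
      .astype(int)                            # counts as integers instead of floats
      .T                                      # questions as rows, answers as columns
)

print(summary)
```

The `reindex` step also guarantees that an answer nobody chose for any question still appears as a column of zeros.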