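I've seen several ways to take data and create counts by group, but what I want to do is a bit more complicated. I have a dataset similar to the one below: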
d <- data.frame(ID=c("1ef","3ic","9sd"),
CI_Region=c("Bay Area","North Sierra","Central Valley"),
Q18_429=c("Not a threat","Slightly serious","Very Serious"),
Q18_430=c("Extremely serious","Somewhat serious","Slightly serious"),
Q18_431=c("Slightly serious","Unknown","No Answer"))
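I want to group by CI_Region and then, for each question, count how often each response occurs (e.g. "Not a threat", "Slightly serious", and so on).
The end result would be a table showing the counts of response categories by row and CI region, so that I could see, for example, Bay Area - Q18_429 - Not a threat = 1.
Thanks in advance!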
Answer 0 (score: 0)
d <- data.frame(ID=c("1ef","3ic","9sd"),
CI_Region=c("Bay Area","North Sierra","Central Valley"),
Q18_429=c("Not a threat","Slightly serious","Very Serious"),
Q18_430=c("Extremely serious","Somewhat serious","Slightly serious"),
Q18_431=c("Slightly serious","Unknown","No Answer"))
Reshaping the data into a tidier (long) format, with one row per ID, question, and response, makes this easier.
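library(dplyr)  # provides %>%, group_by() and tally()
library(tidyr)  # provides gather()

# Stack the question columns into question/response pairs, then count each combination
gather(d, question, response, -ID, -CI_Region) %>%
  group_by(CI_Region, question, response) %>%
  tally()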
CI_Region       question  response           n
(fctr)          (fctr)    (chr)              (int)
Bay Area        Q18_429   Not a threat       1
Bay Area        Q18_430   Extremely serious  1
Bay Area        Q18_431   Slightly serious   1
Central Valley  Q18_429   Very Serious       1
Central Valley  Q18_430   Slightly serious   1
Central Valley  Q18_431   No Answer          1
North Sierra    Q18_429   Slightly serious   1
North Sierra    Q18_430   Somewhat serious   1
North Sierra    Q18_431   Unknown            1
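In newer tidyr releases (1.0.0 and later) gather() is superseded by pivot_longer(), so the same reshape-then-count idea can also be sketched as follows. This is only a minimal sketch, not the original answer's code: it assumes the question columns all start with "Q18_", and the name counts is a hypothetical variable introduced just for illustration.

```r
library(dplyr)
library(tidyr)

# Sample data from the question
d <- data.frame(
  ID = c("1ef", "3ic", "9sd"),
  CI_Region = c("Bay Area", "North Sierra", "Central Valley"),
  Q18_429 = c("Not a threat", "Slightly serious", "Very Serious"),
  Q18_430 = c("Extremely serious", "Somewhat serious", "Slightly serious"),
  Q18_431 = c("Slightly serious", "Unknown", "No Answer")
)

# 'counts' is a hypothetical name used only in this sketch
counts <- d %>%
  # Assumption: every question column starts with "Q18_"
  pivot_longer(
    cols = starts_with("Q18_"),
    names_to = "question",
    values_to = "response"
  ) %>%
  # count() is shorthand for group_by() followed by tally()
  count(CI_Region, question, response)

print(counts)
```

With the three-row sample above, every region/question/response combination occurs once; on real survey data the n column would hold the per-region totals for each response.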
Is this what you were looking for?