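I ran a survey with multiple-choice questions, where the selected options are stored in a single column separated by commas, along with a grouping question (e.g., sex). Now I want to cross-tabulate these two variables.
My data contain 2 columns: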
dat <- data.frame(Multiple = c("A,B,C","B","A,C"), Sex = c("M","F","F"))
I want to cross-tabulate the individual multiple-choice options (without the commas) by sex:
Multiple Sex Count
A M 1
B M 1
C M 1
A F 1
B F 1
C F 1
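Here is a partial solution in which I only count the elements of the multiple-choice question. My problem is that I don't know how to include the grouping variable Sex in this function, because I use regular expressions to count the elements in the comma-separated vector: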
MSCount <- function(X){
  # Function to count values in a comma separated vector
  # Find the possible options from the data alone, e.g. "A", "B" etc.
  Answers <- sort(unique(unlist(strsplit(as.character(X), ","))))
  Answers <- Answers[Answers != ""] # Drop blank answers (logical indexing is safe when there are none)
  CountAnswers <- numeric(length(Answers)) # Initialise the counts, one per answer
  for(i in seq_along(Answers)){
    CountAnswers[i] <- sum(grepl(Answers[i], X))
  } # Loop round and count the rows with a match for the answer text
  SummaryAnswers <- data.frame(Answers, CountAnswers, PropAnswers = 100*CountAnswers/length(X[!is.na(X)]))
  return(SummaryAnswers)
}
Answer 0: (score: 1)
We can use separate_rows
library(tidyverse)
separate_rows(dat, Multiple) %>%
mutate(Count = 1) %>%
arrange(Sex, Multiple) %>%
select(Multiple, Sex, Count)
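The pipeline above sets Count to 1 on every row, which matches the desired output here only because no option/sex pair repeats in the sample data. For data where pairs can repeat, a minimal base R sketch that tallies actual frequencies might look like the following. The names `opts`, `long` and `crosstab` are just illustrative, and it assumes the options are separated by a plain comma with no surrounding spaces:

```r
dat <- data.frame(Multiple = c("A,B,C", "B", "A,C"),
                  Sex = c("M", "F", "F"),
                  stringsAsFactors = FALSE)

# Split each response into its individual options (one character vector per respondent)
opts <- strsplit(as.character(dat$Multiple), ",", fixed = TRUE)

# Long format: one row per selected option, with Sex repeated once per option
long <- data.frame(Multiple = unlist(opts),
                   Sex = rep(dat$Sex, lengths(opts)),
                   stringsAsFactors = FALSE)

# Drop missing or blank answers
long <- long[!is.na(long$Multiple) & long$Multiple != "", ]

# Cross-tabulate option by sex and return the counts as a data frame
crosstab <- as.data.frame(table(Multiple = long$Multiple, Sex = long$Sex),
                          responseName = "Count")
crosstab
```

For the sample data this gives six rows, one per option/sex combination, each with a Count of 1. Within the tidyverse pipeline, replacing `mutate(Count = 1)` with dplyr's `count(Multiple, Sex, name = "Count")` should have the same effect.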