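I have some qualitative data that I have coded into various categories, and I want to provide summaries for subgroups. The RQDA package is great for coding interviews, but I have been struggling to create summaries for open-ended survey responses. I managed to export the coded file as HTML and then copy/paste it into Excel. I now have 500 rows in which all the categories are in different columns, but the same code may appear in different columns. For example, some data: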
a <- c("ResponseA", "ResponseB", "ResponseC", "ResponseD", "NA")
b <- c("ResponseD", "ResponseC", "NA", "NA","NA")
c <- c("ResponseB", "ResponseA", "ResponseE", "NA", "NA")
d <- c("ResponseC", "ResponseB", "ResponseA", "NA", "NA")
df <- data.frame (a,b,c,d)
I would like to be able to run something like:
df$ResponseA <- recode (df$a | df$b | df$c, "
'ResponseA' = '1';
else='0' ")
df$ResponseB <- recode (df$a | df$b | df$c, "
'ResponseB' = '1';
else='0' ")
In short, I would like to scan 9 columns and recode them into a single binary variable for each response code.
Answer 0: (score: 1)
If I understand the question correctly, perhaps you could try something like this:
## Convert your data into a long format first
dfL <- cbind(id = sequence(nrow(df)), stack(lapply(df, as.character)))
## The next three lines are mostly cleanup
dfL$id <- factor(dfL$id, sequence(nrow(df)))
dfL$values[dfL$values == "NA"] <- NA
dfL <- dfL[complete.cases(dfL), ]
## `table` is the real workhorse here
cbind(df, (table(dfL[1:2]) > 0) * 1)
#           a         b         c         d ResponseA ResponseB ResponseC ResponseD ResponseE
# 1 ResponseA ResponseD ResponseB ResponseC         1         1         1         1         0
# 2 ResponseB ResponseC ResponseA ResponseB         1         1         1         0         0
# 3 ResponseC        NA ResponseE ResponseA         1         0         1         0         1
# 4 ResponseD        NA        NA        NA         0         0         0         1         0
# 5        NA        NA        NA        NA         0         0         0         0         0
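As a cross-check on the `table` approach above, the same indicators can be built more literally by asking, for each code, whether it appears anywhere in a row. This is only a minimal base R sketch: the names `m`, `codes`, `flags`, and `result` are introduced here for illustration, and it assumes missing values are stored as the string "NA", as in the example data.

```r
# Example data from the question (missing values are the string "NA")
a <- c("ResponseA", "ResponseB", "ResponseC", "ResponseD", "NA")
b <- c("ResponseD", "ResponseC", "NA", "NA", "NA")
c <- c("ResponseB", "ResponseA", "ResponseE", "NA", "NA")
d <- c("ResponseC", "ResponseB", "ResponseA", "NA", "NA")
df <- data.frame(a, b, c, d, stringsAsFactors = FALSE)

# Character matrix, so comparisons behave the same for factor or character columns
m <- as.matrix(df)

# Every distinct code except the "NA" placeholder (illustrative name)
codes <- sort(setdiff(unique(as.vector(m)), "NA"))

# For each code: 1 if it appears in any column of the row, otherwise 0
flags <- sapply(codes, function(code) {
  as.integer(rowSums(m == code, na.rm = TRUE) > 0)
})

# Bind the indicator columns to the original data
result <- cbind(df, flags)
```

With the real data, `df` would be the nine coded columns. The loop over `codes` is slower than `table` on very large data, but it makes the "scan every column for this code" logic explicit.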
You could also try the following:
(table(rep(1:nrow(df), ncol(df)), unlist(df)) > 0) * 1L
#    
#     NA ResponseA ResponseB ResponseC ResponseD ResponseE
#   1  0         1         1         1         1         0
#   2  0         1         1         1         0         0
#   3  1         1         0         1         0         1
#   4  1         0         0         0         1         0
#   5  1         0         0         0         0         0
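The one-liner above keeps an `NA` column because the missing values in the example are the literal string "NA". If that column is not wanted, one possible variation (a sketch, not part of the original answer) is to turn those strings into real missing values first, since `table` ignores `NA` by default. The names `vals`, `row_id`, and `ind` are made up for this illustration.

```r
# Example data from the question (missing values are the string "NA")
a <- c("ResponseA", "ResponseB", "ResponseC", "ResponseD", "NA")
b <- c("ResponseD", "ResponseC", "NA", "NA", "NA")
c <- c("ResponseB", "ResponseA", "ResponseE", "NA", "NA")
d <- c("ResponseC", "ResponseB", "ResponseA", "NA", "NA")
df <- data.frame(a, b, c, d, stringsAsFactors = FALSE)

# Flatten all columns into one character vector (column by column)
vals <- unlist(lapply(df, as.character), use.names = FALSE)

# Replace the "NA" placeholder strings with real missing values
vals[vals %in% "NA"] <- NA

# Row number for each flattened value: 1..nrow repeated once per column
row_id <- rep(seq_len(nrow(df)), times = ncol(df))

# Cross-tabulate rows by codes; NA values are dropped, all rows are kept
ind <- (table(row_id, vals) > 0) * 1L
```

Rows with no codes at all (row 5 here) still appear, with zeros in every column, because the row levels come from `row_id` rather than from the non-missing values.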