Question

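I have some qualitative data that I have coded into various categories, and I would like to provide summaries for subgroups. The RQDA package is great for coding interviews, but I have been struggling to create summaries for open-ended survey responses. I managed to export the coded file to HTML and then copy/paste it into Excel. I now have 500 rows with all the categories in different columns, but the same code can appear in different columns. For example, some data: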

a <- c("ResponseA", "ResponseB", "ResponseC", "ResponseD", "NA")
b <- c("ResponseD", "ResponseC", "NA", "NA","NA")
c <- c("ResponseB", "ResponseA", "ResponseE", "NA", "NA")
d <- c("ResponseC", "ResponseB", "ResponseA", "NA", "NA")
df <- data.frame (a,b,c,d)

I would like to be able to run something like:

df$ResponseA <- recode (df$a | df$b | df$c, "
                        'ResponseA' = '1'; 
                         else='0' ")
df$ResponseB <- recode (df$a | df$b | df$c, "
                        'ResponseB' = '1'; 
                         else='0' ")

In short, I would like to scan 9 columns and recode them into a single binary variable.

Answer 1

If I understand the question correctly, perhaps you can try something like this:

## Convert your data into a long format first
dfL <- cbind(id = sequence(nrow(df)), stack(lapply(df, as.character)))

## The next three lines are mostly cleanup
dfL$id <- factor(dfL$id, sequence(nrow(df)))
dfL$values[dfL$values == "NA"] <- NA
dfL <- dfL[complete.cases(dfL), ]

## `table` is the real workhorse here
cbind(df, (table(dfL[1:2]) > 0) * 1)
#           a         b         c         d ResponseA ResponseB ResponseC ResponseD ResponseE
# 1 ResponseA ResponseD ResponseB ResponseC         1         1         1         1         0
# 2 ResponseB ResponseC ResponseA ResponseB         1         1         1         0         0
# 3 ResponseC        NA ResponseE ResponseA         1         0         1         0         1
# 4 ResponseD        NA        NA        NA         0         0         0         1         0
# 5        NA        NA        NA        NA         0         0         0         0         0
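
To see why the `> 0` step is there, it may help to look at the raw counts before they are turned into indicators. The sketch below rebuilds the example data (with `stringsAsFactors = FALSE` added as an assumption, so the columns are plain character vectors) and repeats the steps above; the names `counts` and `indicators` are introduced here only for illustration.

```r
# Same example data as in the question; "NA" is a literal string here
df <- data.frame(
  a = c("ResponseA", "ResponseB", "ResponseC", "ResponseD", "NA"),
  b = c("ResponseD", "ResponseC", "NA", "NA", "NA"),
  c = c("ResponseB", "ResponseA", "ResponseE", "NA", "NA"),
  d = c("ResponseC", "ResponseB", "ResponseA", "NA", "NA"),
  stringsAsFactors = FALSE
)

# Long format: one row per (respondent id, code) pair
dfL <- cbind(id = sequence(nrow(df)), stack(lapply(df, as.character)))
dfL$id <- factor(dfL$id, sequence(nrow(df)))   # keep respondents with no codes
dfL$values[dfL$values == "NA"] <- NA            # turn "NA" strings into real NA
dfL <- dfL[complete.cases(dfL), ]

# Raw counts: how many times each code appears for each respondent
counts <- table(dfL[1:2])
counts["2", "ResponseB"]   # respondent 2 has ResponseB in columns a and d

# Collapse counts to 0/1 indicators
indicators <- (counts > 0) * 1
```

Respondent 2 has `ResponseB` in both `a` and `d`, so the raw count is 2; comparing against 0 and multiplying by 1 reduces that to a single binary flag.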

You can also try the following:

(table(rep(1:nrow(df), ncol(df)), unlist(df)) > 0) * 1L
#    
#     NA ResponseA ResponseB ResponseC ResponseD ResponseE
#   1  0         1         1         1         1         0
#   2  0         1         1         1         0         0
#   3  1         1         0         1         0         1
#   4  1         0         0         0         1         0
#   5  1         0         0         0         0         0
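
The extra `NA` column in the output above appears because the example data stores `"NA"` as a string, so `table` treats it as just another code. One possible variation, sketched here under the assumption that the columns are character vectors (hence `stringsAsFactors = FALSE`), is to convert those strings to real missing values first; `table` then drops them by default. The names `ind` and `result` are only illustrative.

```r
# Same example data as in the question, kept as character columns
df <- data.frame(
  a = c("ResponseA", "ResponseB", "ResponseC", "ResponseD", "NA"),
  b = c("ResponseD", "ResponseC", "NA", "NA", "NA"),
  c = c("ResponseB", "ResponseA", "ResponseE", "NA", "NA"),
  d = c("ResponseC", "ResponseB", "ResponseA", "NA", "NA"),
  stringsAsFactors = FALSE
)

# Replace the literal "NA" strings with real missing values
df[df == "NA"] <- NA

# Row ids repeated once per column, matching the column-wise order of unlist(df)
ids <- rep(seq_len(nrow(df)), ncol(df))

# table() ignores NA by default, so no NA column is produced
ind <- (table(ids, unlist(df)) > 0) * 1L

# Attach the indicator columns to the original data
result <- cbind(df, ind)
```

With this change the indicator matrix has only the five `Response*` columns, and it can be bound to `df` in the same way as in the first approach.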

Recode multiple columns into a single variable
