Question

I have a dataset with responses to 90 Likert items that I want to convert to numeric values. It is structured like the following example:

q6 <- c("Daily", "Never", "Often", "Very Often", "Daily")
q7 <- c("Never", "Never", "Often", "Often", "Daily")
q23 <- c("Daily", "Often", "Never", "Never", "Neutral")
q17 <- c("Important", "Important", "Very Important", "Neutral", "Not Important")
example <- cbind(q6, q7, q17, q23)

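The responses differ slightly from question to question, but they mostly run along scales from Strongly Agree to Strongly Disagree, Daily to Never, or Important to Not Important. The response to each of the 90 questions is in its own column (labelled q1 to q90). I would like to create new columns whose numeric values correspond to the text responses (Strongly Agree (3) to Strongly Disagree (-3), via Neutral (0)). Like this: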

q6 <- c("Daily", "Never", "Often", "Very Often", "Daily")
n6 <- c(3,-3,1,2,3)
q17 <- c("Important", "Important", "Very Important", "Neutral", "Not Important")
n17 <- c(2,2,3,0,-3)
num_example <- cbind(q6, n6, q17, n17)
num_example

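So far I have managed to get as far as the code below. It generates a new variable called n6 that matches the text responses in the existing q6 column, which I can then add to the existing data frame with cbind. My question is: how can I automate this across the whole data frame of 90 questions, without running the code below for every response (i.e. changing q6 to q7, then to q8, and so on)?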

# h16 is my actual data frame; 5 flags any response that matches none of the labels
n6 <- ifelse(h16$q6=="Daily", 3,
                  ifelse(h16$q6=="",0,
                  ifelse(h16$q6=="Very Often", 2,
                  ifelse(h16$q6=="Often", 1,
                  ifelse(h16$q6=="Neither Rarely nor Often", 0,
                  ifelse(h16$q6=="Rarely", -1,
                  ifelse(h16$q6=="Very Rarely", -2,
                  ifelse(h16$q6=="Never", -3,5
                         ))))))))

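For further reference, columns q6:q12 and then q23:q30 have responses ranging from Daily to Never, as in the example above. Columns q17:q22 range from Not Important to Very Important, and columns q49:q90 range from Strongly Agree to Strongly Disagree. I am trying to find a smarter way to run the code above on the relevant columns (e.g. q6:q12, q23:q30) so that it produces a new data frame with the numeric values in columns named n6:n16, n23:n30, rather than having to run the code 90 times!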

I hope this is a clear explanation of the problem.

Thanks.

Answer 1

The plyr package has a function called revalue, which its documentation describes as: "Replace specified values with new values, in a factor or character vector." That may help here:

 library(plyr)
 # Replacement values are strings; "Important" maps to "2" to match the output below
 example2 <- revalue(example, c("Daily" = "3", "Never" = "-3", "Often" = "1",
             "Very Often" = "2", "Important" = "2", "Very Important" = "3",
             "Neutral" = "0", "Not Important" = "-3"))

     q6   q7   q17  q23 
[1,] "3"  "-3" "2"  "3" 
[2,] "-3" "-3" "2"  "1" 
[3,] "1"  "1"  "3"  "-3"
[4,] "2"  "1"  "0"  "-3"
[5,] "3"  "3"  "-3" "0"

Data

q6 <- c("Daily", "Never", "Often", "Very Often", "Daily")
q7 <- c("Never", "Never", "Often", "Often", "Daily")
q23 <- c("Daily", "Often", "Never", "Never", "Neutral")
q17 <- c("Important", "Important", "Very Important", "Neutral", "Not Important")
example <- cbind(q6, q7, q17, q23)

Alternatively, mapvalues also works:

 # from and to are matched by position, so the scores must follow the order of the labels
 mapvalues(example,
           from = c("Daily", "Never", "Often", "Very Often",
                    "Important", "Very Important", "Neutral", "Not Important"),
           to = c(3, -3, 1, 2, 2, 3, 0, -3))
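
Note that the revalue output above holds the scores as quoted strings, not numbers. If numeric values are needed (for means, sums and so on), one option is to change the storage mode of the matrix. The following is a minimal sketch in base R; the small example2 matrix is typed in by hand here to stand in for the recoded result, so it runs without plyr.

```r
# Stand-in for the recoded character matrix (hand-typed, two columns only)
example2 <- cbind(q6  = c("3", "-3", "1", "2", "3"),
                  q17 = c("2", "2", "3", "0", "-3"))

# Convert the strings to numbers while keeping dimensions and column names
storage.mode(example2) <- "numeric"

# The matrix can now be used in arithmetic, e.g. per-question means
colMeans(example2)
```

Calling as.numeric(example2) would also convert the values, but it drops the matrix dimensions and returns a plain vector, which is why the storage mode is changed in place here.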

Answer 2

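There are quicker ways, but since you have already done all this work, turn your current process into a function and then use sapply to loop over all the columns: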

Note that I have changed q6 to [, x]:

# x is a column name; h16 is looked up when the function is called
numConvert <- function(x) ifelse(h16[, x] == "Daily", 3,
                          ifelse(h16[, x] == "", 0,
                          ifelse(h16[, x] == "Very Often", 2,
                          ifelse(h16[, x] == "Often", 1,
                          ifelse(h16[, x] == "Neither Rarely nor Often", 0,
                          ifelse(h16[, x] == "Rarely", -1,
                          ifelse(h16[, x] == "Very Rarely", -2,
                          ifelse(h16[, x] == "Never", -3, 5  # 5 marks unmatched labels
                          ))))))))

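Now the function takes a column name and converts that column according to your specification. Try it (the 5s in q17 and q23 are labels such as "Important" and "Neutral" that the Daily-to-Never scale does not cover):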

h16 <- example
sapply(colnames(example), numConvert)
#      q6 q7 q17 q23
# [1,]  3 -3   5   3
# [2,] -3 -3   5   1
# [3,]  1  1   5  -3
# [4,]  2  1   5  -3
# [5,]  3  3   5   5

Edit

If you want to use shiny new features, try case_when, provided by dplyr >= 0.5.0:

library(dplyr)
factorise <- function(x) {
  case_when(x %in% c("Daily", "Very Important") ~ 3,
            x %in% c("Very Often", "Important") ~ 2,
            x %in% c("Often") ~ 1,
            x %in% c("Neutral") ~ 0,
            x %in% c("Never", "Not Important") ~ -3)
}
# example is a matrix, so convert it to a data frame to apply the function per column
sapply(as.data.frame(example), factorise)
#      q6 q7 q17 q23
# [1,]  3 -3   2   3
# [2,] -3 -3   2   1
# [3,]  1  1   3  -3
# [4,]  2  1   0  -3
# [5,]  3  3  -3   0

Answer 3

If you want to use base R, I would suggest building a lookup table from a named vector rather than nesting multiple ifelse calls, for example:

n <- c('Daily'=3, 'Very Often'=2, 'Often'=1, 'Never'=-3)
n[q6]
#Daily      Never      Often Very Often      Daily 
#    3         -3          1          2          3 
n[q7]
#Never Never Often Often Daily 
#   -3    -3     1     1     3
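
To apply the same lookup idea across many columns at once, and to use a different scale for different groups of questions, something along these lines may work. This is only a sketch under some assumptions: the small h16 data frame below is a stand-in for the real data, the helper name recode_cols is made up for illustration, and the scores are taken from the question's own coding scheme.

```r
# Hypothetical stand-in for the real data frame (called h16 in the question)
h16 <- data.frame(
  q6  = c("Daily", "Never", "Often", "Very Often", "Daily"),
  q7  = c("Never", "Never", "Often", "Often", "Daily"),
  q17 = c("Important", "Important", "Very Important", "Neutral", "Not Important"),
  stringsAsFactors = FALSE
)

# One named-vector lookup table per response scale
freq_scale <- c("Daily" = 3, "Very Often" = 2, "Often" = 1,
                "Neither Rarely nor Often" = 0, "Rarely" = -1,
                "Very Rarely" = -2, "Never" = -3)
imp_scale  <- c("Very Important" = 3, "Important" = 2, "Neutral" = 0,
                "Not Important" = -3)

# Illustrative helper: recode the named columns and rename q* to n*
recode_cols <- function(df, cols, lookup) {
  out <- lapply(df[cols], function(x) unname(lookup[as.character(x)]))
  names(out) <- sub("^q", "n", cols)
  as.data.frame(out)
}

# Recode each group of columns with its own scale and combine the results
num <- cbind(
  recode_cols(h16, c("q6", "q7"), freq_scale),
  recode_cols(h16, "q17", imp_scale)
)
num
```

For the full dataset, the column names could be generated with, for example, paste0("q", c(6:12, 23:30)). Any response that is not in the lookup table, including an empty string, comes back as NA rather than a sentinel value, so unexpected labels are easy to spot.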

Converting Likert data to numbers in a data frame
