I have a dataset with responses to 90 Likert items that I want to convert to numeric values. It is structured like the following example:
q6 <- c("Daily", "Never", "Often", "Very Often", "Daily")
q7 <- c("Never", "Never", "Often", "Often", "Daily")
q23 <- c("Daily", "Often", "Never", "Never", "Neutral")
q17 <- c("Important", "Important", "Very Important", "Neutral", "Not Important")
example <- cbind(q6, q7, q17, q23)
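The responses to each question differ slightly, but they mostly range from Strongly Agree to Strongly Disagree, from Daily to Never, or from Important to Not Important. The responses to each of the 90 questions are in separate columns (labelled q1 to q90). I want to create new columns whose numeric values correspond to the text responses (Strongly Agree (3) to Strongly Disagree (-3), through Neutral (0)). Like this: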
q6 <- c("Daily", "Never", "Often", "Very Often", "Daily")
n6 <- c(3,-3,1,2,3)
q17 <- c("Important", "Important", "Very Important", "Neutral", "Not Important")
n17 <- c(2,2,3,0,-3)
num_example <- cbind(q6, n6, q17, n17)
num_example
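So far I have managed with the code below, which generates a new variable called n6 that matches the text responses in the existing q6 column; I can then add it to the existing data frame with cbind. My question is: how can I automate this across the whole data frame of 90 questions, without running the code below for each response (i.e. changing q6 to q7, then to q8, and so on)?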
n6 <- ifelse(example$q6=="Daily", 3,
ifelse(h16$q6=="",0,
ifelse(h16$q6=="Very Often", 2,
ifelse(h16$q6=="Often", 1,
ifelse(h16$q6=="Neither Rarely nor Often", 0,
ifelse(h16$q6=="Rarely", -1,
ifelse(h16$q6=="Very Rarely", -2,
ifelse(h16$q6=="Never", -3,5
))))))))
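For further reference, columns q6:q12 and q23:q30 have responses ranging from Daily to Never, as in the example above. Columns q17:q22 range from Not Important to Very Important, and columns q49:q90 range from Strongly Agree to Strongly Disagree. I am looking for a smarter way to run the code above on the relevant columns (e.g. q6:q12, q23:q30) so that it produces a new data frame with the numeric values in columns named n6:n16, n23:n30, rather than having to run the code 90 times!
I hope this explains the problem clearly.
Thanks.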
Answer 0 (score: 3)
The plyr package has a function called revalue: "Replace specified values with new values, in a factor or character vector."
This might help here:
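library(plyr)
# Map each text label to its score; the result is still a character matrix.
# "Important" maps to "2" so that it matches the output shown below.
example2 <- revalue(example, c("Daily"= "3", "Never"= "-3", "Often"= "1",
"Very Often"= "2", "Important" = "2", "Very Important"= "3",
"Neutral"= "0", "Not Important"= "-3" ))
example2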
q6 q7 q17 q23
[1,] "3" "-3" "2" "3"
[2,] "-3" "-3" "2" "1"
[3,] "1" "1" "3" "-3"
[4,] "2" "1" "0" "-3"
[5,] "3" "3" "-3" "0"
Data
q6 <- c("Daily", "Never", "Often", "Very Often", "Daily")
q7 <- c("Never", "Never", "Often", "Often", "Daily")
q23 <- c("Daily", "Often", "Never", "Never", "Neutral")
q17 <- c("Important", "Important", "Very Important", "Neutral", "Not Important")
example <- cbind(q6, q7, q17, q23)
Alternatively, mapvalues also works:
# `from` and `to` are matched by position, so the scores must follow the label order
mapvalues(example, from = c("Daily", "Never", "Often", "Very Often",
"Important", "Very Important", "Neutral", "Not Important"),
to = c(3,-3,1,2,2,3,0,-3))
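Both calls above return a character matrix (the scores come back as strings such as "3"), so one more step is needed before doing arithmetic on them. A minimal base R sketch, assuming a result shaped like the printed output above; the matrix `example2_chr` is a hypothetical stand-in built by hand so the snippet runs without plyr:

```r
# Hypothetical stand-in for the character matrix returned by revalue()
example2_chr <- matrix(
  c("3", "-3", "1", "2", "3",
    "-3", "-3", "1", "1", "3"),
  ncol = 2,
  dimnames = list(NULL, c("q6", "q7"))
)

# Switch the storage mode to double; dimensions and column names are kept
example2_num <- example2_chr
storage.mode(example2_num) <- "double"

# The columns can now be used in numeric summaries
colMeans(example2_num)
```

Any label that was not recoded would become NA at this step (with a coercion warning), which is a convenient way to spot responses missing from the mapping.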
Answer 1 (score: 1)
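There are faster ways, but since you have already done all this work, turn your current process into a function and then use sapply to loop over all the columns.
Note that I have changed q6 to [,x]: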
numConvert <- function(x) ifelse(example[,x]=="Daily", 3,
ifelse(h16[,x]=="",0,
ifelse(h16[,x]=="Very Often", 2,
ifelse(h16[,x]=="Often", 1,
ifelse(h16[,x]=="Neither Rarely nor Often", 0,
ifelse(h16[,x]=="Rarely", -1,
ifelse(h16[,x]=="Very Rarely", -2,
ifelse(h16[,x]=="Never", -3,5
))))))))
The function now takes a column name and converts that column according to your specification. Try it:
h16 <- example
sapply(colnames(example), numConvert)
# q6 q7 q17 q23
# [1,] 3 -3 5 3
# [2,] -3 -3 5 1
# [3,] 1 1 5 -3
# [4,] 2 1 5 -3
# [5,] 3 3 5 5
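The 5s in the output above come from the final fallback of the nested ifelse: any label that the chain does not list (here the importance labels and "Neutral") falls through to 5. A small base R sketch for listing such labels before converting, assuming the same example matrix as in the question; the names `known` and `unmatched` are illustrative only:

```r
q6  <- c("Daily", "Never", "Often", "Very Often", "Daily")
q7  <- c("Never", "Never", "Often", "Often", "Daily")
q23 <- c("Daily", "Often", "Never", "Never", "Neutral")
q17 <- c("Important", "Important", "Very Important", "Neutral", "Not Important")
example <- cbind(q6, q7, q17, q23)

# Labels handled explicitly by the nested ifelse chain
known <- c("Daily", "", "Very Often", "Often", "Neither Rarely nor Often",
           "Rarely", "Very Rarely", "Never")

# Labels present in the data that would fall through to the fallback value
unmatched <- setdiff(unique(as.vector(example)), known)
unmatched
```

Running a check like this on the full data would show which response scales still need their own mapping.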
Edit:
If you want to use a shiny new feature, try case_when, available in dplyr >= 0.5.0:
library(dplyr)
factorise <- function(x) {
case_when(x %in% c("Daily", "Very Important") ~ 3,
x %in% c("Very Often", "Important") ~ 2,
x %in% c("Often") ~ 1,
x %in% c("Neutral") ~ 0,
x %in% c("Never", "Not Important") ~ -3)
}
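# example is a matrix, so apply over its columns (MARGIN = 2);
# sapply() on a matrix would loop over individual cells instead
apply(example, 2, factorise)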
# q6 q7 q17 q23
# [1,] 3 -3 2 3
# [2,] -3 -3 2 1
# [3,] 1 1 3 -3
# [4,] 2 1 0 -3
# [5,] 3 3 -3 0
Answer 2 (score: 1)
If you want to use base R, I would suggest building a lookup table from a named vector rather than nesting multiple ifelse calls, for example:
n <- c('Daily'=3, 'Very Often'=2, 'Often'=1, 'Never'=-3)
n[q6]
#Daily Never Often Very Often Daily
# 3 -3 1 2 3
n[q7]
#Never Never Often Often Daily
# -3 -3 1 1 3
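The same lookup idea can be extended to many columns at once, which is what the question asks for. The following is a sketch only, assuming the data sit in a data frame (called `h16` here, as in the question) and that each group of columns shares one response scale; the toy values, the `scales` list, and the loop are illustrative assumptions rather than part of the original answer:

```r
# Toy data frame standing in for the real 90-column data (hypothetical values)
h16 <- data.frame(
  q6  = c("Daily", "Never", "Often", "Very Often", "Daily"),
  q7  = c("Never", "Never", "Often", "Often", "Daily"),
  q17 = c("Important", "Important", "Very Important", "Neutral", "Not Important"),
  stringsAsFactors = FALSE
)

# One named lookup vector per response scale (scores taken from the question)
freq_scale <- c("Daily" = 3, "Very Often" = 2, "Often" = 1,
                "Neither Rarely nor Often" = 0,
                "Rarely" = -1, "Very Rarely" = -2, "Never" = -3)
imp_scale  <- c("Very Important" = 3, "Important" = 2,
                "Neutral" = 0, "Not Important" = -3)

# Which columns use which scale
scales <- list(
  list(cols = c("q6", "q7"), lookup = freq_scale),
  list(cols = "q17",         lookup = imp_scale)
)

for (s in scales) {
  for (col in s$cols) {
    new_name <- sub("^q", "n", col)  # q6 -> n6, q17 -> n17
    # as.character() guards against factor columns, which would otherwise
    # index the lookup by integer code; unlisted labels become NA
    h16[[new_name]] <- unname(s$lookup[as.character(h16[[col]])])
  }
}

h16
```

For the full data set, the column groups could be generated rather than typed, for instance `paste0("q", c(6:12, 23:30))` for the Daily-to-Never items.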