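I have a data frame that contains only the categorical values "Agree", "Disagree" and "Not Certain". I simply want to replace "Agree" with the number 2, "Disagree" with 1 and "Not Certain" with 0.5, so that I can add them up and get a score.
I found that mapvalues only works on factors and vectors, and I don't know how to use as.numeric in a way that lets me specify which value should be assigned to each category. Also, I can't actually replace the values in the data frame; it just creates a new value, like a data frame with the three numbers in it.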
Answer 0 (score: 0)
Since you did not provide sample data, let's generate a vector of 10 random elements drawn from "Agree", "Disagree" and "Not Certain":
set.seed(2017)
ss <- sample(c("Agree", "Disagree", "Not Certain"), 10, replace = TRUE)
We assign a numeric value to each string in a named vector, and use match to map the string entries to those values:
val <- c("Agree" = 2, "Disagree" = 1, "Not Certain" = 0.5)
val[match(ss, names(val))]
#Not Certain    Disagree    Disagree       Agree Not Certain Not Certain
#        0.5         1.0         1.0         2.0         0.5         0.5
#      Agree    Disagree    Disagree       Agree
#        2.0         1.0         1.0         2.0
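One thing worth checking with this lookup is what happens to entries that are not among the names of the value vector. The following is a minimal sketch; the `responses` vector is hypothetical and not from the question. `match` returns `NA` for anything it cannot find (including a different capitalization), so those entries come out as `NA` instead of raising an error.

```r
# Same lookup vector as in the answer
val <- c("Agree" = 2, "Disagree" = 1, "Not Certain" = 0.5)

# Hypothetical input with a differently capitalized entry and a missing value
responses <- c("Agree", "agree", NA, "Not Certain")

# match() yields NA for unmatched entries, and indexing with NA yields NA
scores <- unname(val[match(responses, names(val))])
scores

# Decide explicitly how missing scores should be treated when summing
total <- sum(scores, na.rm = TRUE)
total
```

Whether dropping the `NA` entries with `na.rm = TRUE` is appropriate depends on how the survey should be scored; it may be safer to inspect unmatched entries first.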
To get the total score, we can sum the mapped values:
sum(val[match(ss, names(val))])
#[1] 11.5
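The question also asks about replacing the values inside the data frame itself. A possible way to apply the same named-vector lookup to every column, sketched here with base R only, is shown below. The data frame `survey` and its column names are made up for illustration; assigning to `survey[]` keeps the data frame structure while overwriting each column.

```r
# Hypothetical sample data (not from the original post)
survey <- data.frame(
  q1 = c("Agree", "Disagree", "Not Certain"),
  q2 = c("Not Certain", "Agree", "Agree"),
  stringsAsFactors = FALSE
)

val <- c("Agree" = 2, "Disagree" = 1, "Not Certain" = 0.5)

# Look up each column by name; as.character() also covers factor columns.
# Assigning to survey[] replaces the columns in place.
survey[] <- lapply(survey, function(col) unname(val[as.character(col)]))

# Per-respondent score: sum across the (now numeric) columns
survey$score <- rowSums(survey)
survey
```

This assumes every column of the data frame holds survey answers; if other columns are present, the `lapply` call would need to be restricted to the answer columns.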
Answer 1 (score: 0)
In this case dplyr::case_when can be used, since the OP wants to replace the string values in all columns of the data.frame.
library(dplyr)
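# funs() is deprecated and mutate_all() is superseded in current dplyr;
# across() (dplyr >= 1.0.0) applies the same recoding to every column.
df %>% mutate(across(everything(), ~ case_when(
.x == "Agree" ~ 2,
.x == "Disagree" ~ 1,
.x == "Not Certain" ~ 0.5
)))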
#    FirstCol SecondCol ThirdCol
# 1       2.0       2.0      0.5
# 2       1.0       2.0      2.0
# 3       1.0       0.5      1.0
# 4       0.5       1.0      2.0
# 5       2.0       0.5      2.0
# 6       0.5       1.0      1.0
# 7       0.5       0.5      2.0
# 8       1.0       0.5      1.0
# 9       1.0       1.0      0.5
# 10      2.0       0.5      1.0
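Note that the pipeline above only prints the converted data frame; to keep the result it has to be assigned. If the data frame also holds columns that are not survey answers, the recoding can be limited to selected columns. The sketch below assumes dplyr 1.0.0 or later; the data frame `responses`, its `id` column and the `answer_cols` vector are hypothetical names introduced for illustration.

```r
library(dplyr)

# Hypothetical data with an id column that must not be recoded
responses <- data.frame(
  id = 1:3,
  q1 = c("Agree", "Disagree", "Not Certain"),
  q2 = c("Not Certain", "Agree", "Agree"),
  stringsAsFactors = FALSE
)

# Columns holding survey answers (assumed names)
answer_cols <- c("q1", "q2")

# Assign the result so the converted values are kept
scored <- responses %>%
  mutate(across(all_of(answer_cols), ~ case_when(
    .x == "Agree"       ~ 2,
    .x == "Disagree"    ~ 1,
    .x == "Not Certain" ~ 0.5
  )))

scored
```

Any value not covered by one of the `case_when` conditions becomes `NA`, so unexpected answers show up as missing rather than being silently scored.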
Data: sample data
choices <- c("Agree", "Disagree", "Not Certain")
set.seed(1)
df <- data.frame(FirstCol = sample(choices, 10, replace = TRUE ),
SecondCol = sample(choices, 10, replace = TRUE ),
ThirdCol = sample(choices, 10, replace = TRUE ),
stringsAsFactors = FALSE)
df
#       FirstCol   SecondCol    ThirdCol
# 1        Agree       Agree Not Certain
# 2     Disagree       Agree       Agree
# 3     Disagree Not Certain    Disagree
# 4  Not Certain    Disagree       Agree
# 5        Agree Not Certain       Agree
# 6  Not Certain    Disagree    Disagree
# 7  Not Certain Not Certain       Agree
# 8     Disagree Not Certain    Disagree
# 9     Disagree    Disagree Not Certain
# 10       Agree Not Certain    Disagree