Question

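I have a data frame containing only the categorical values "Agree", "Disagree" and "Not Certain". I want to replace "Agree" with the number 2, "Disagree" with 1 and "Not Certain" with 0.5, so that I can add them up and get a score.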

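I found that mapvalues only works on factors and vectors, and I don't know how to use as.numeric in a way that lets me specify which value each category should get. Also, I can't actually replace the values in the data frame; it just creates a new one, like a data frame containing the three numbers.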

Answer 1

Since you don't provide sample data, let's generate a vector of 10 random elements drawn from "Agree", "Disagree" and "Not Certain":

set.seed(2017)
ss <- sample(c("Agree", "Disagree", "Not Certain"), 10, replace = TRUE)

We assign a numeric value to each string and use match to map the string entries to those values:

val <- c("Agree" = 2, "Disagree" = 1, "Not Certain" = 0.5)
val[match(ss, names(val))]
#Not Certain    Disagree    Disagree       Agree Not Certain Not Certain
#        0.5         1.0         1.0         2.0         0.5         0.5
#      Agree    Disagree    Disagree       Agree
#        2.0         1.0         1.0         2.0

To get the total score, we can sum the mapped values:

sum(val[match(ss, names(val))])
#[1] 11.5
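
The lookup above can also be written by indexing the named vector directly with the strings. The following is a minimal sketch, not part of the original answer: the vector `answers` is hypothetical example data, and it deliberately includes a label ("Maybe") that has no entry in `val` to show that unmatched labels come back as NA and may need checking before summing.

```r
# Hypothetical example answers (not from the original post)
answers <- c("Agree", "Not Certain", "Disagree", "Agree", "Maybe")

val <- c("Agree" = 2, "Disagree" = 1, "Not Certain" = 0.5)

# Indexing a named vector by a character vector looks values up by name,
# which gives the same numbers as val[match(answers, names(val))]
scores <- unname(val[answers])

# Labels with no entry in val come back as NA; list them for inspection
unmatched <- unique(answers[is.na(scores)])

# Sum the scores, skipping any unmatched labels
total <- sum(scores, na.rm = TRUE)
```

Whether to skip unmatched labels with `na.rm = TRUE` or to stop with an error is a design choice; for survey data it is probably safer to inspect `unmatched` first.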

Answer 2

In this case dplyr::case_when can be used, since the OP wants to replace the string values in every column of the data.frame.

library(dplyr)
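# mutate_all() and funs() are deprecated; across() needs dplyr >= 1.0.0
# Assign the result (df <- df %>% ...) to keep the recoded values
df %>% mutate(across(everything(), ~ case_when(
  .x == "Agree" ~ 2,
  .x == "Disagree" ~ 1,
  .x == "Not Certain" ~ 0.5
)))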

#    FirstCol SecondCol ThirdCol
# 1       2.0       2.0      0.5
# 2       1.0       2.0      2.0
# 3       1.0       0.5      1.0
# 4       0.5       1.0      2.0
# 5       2.0       0.5      2.0
# 6       0.5       1.0      1.0
# 7       0.5       0.5      2.0
# 8       1.0       0.5      1.0
# 9       1.0       1.0      0.5
# 10      2.0       0.5      1.0

Sample data:

choices <- c("Agree", "Disagree", "Not Certain")
set.seed(1)

df <- data.frame(FirstCol = sample(choices, 10, replace = TRUE ),
                 SecondCol = sample(choices, 10, replace = TRUE ),
                 ThirdCol = sample(choices, 10, replace = TRUE ),
                 stringsAsFactors = FALSE)

df
#       FirstCol   SecondCol    ThirdCol
# 1        Agree       Agree Not Certain
# 2     Disagree       Agree       Agree
# 3     Disagree Not Certain    Disagree
# 4  Not Certain    Disagree       Agree
# 5        Agree Not Certain       Agree
# 6  Not Certain    Disagree    Disagree
# 7  Not Certain Not Certain       Agree
# 8     Disagree Not Certain    Disagree
# 9     Disagree    Disagree Not Certain
# 10       Agree Not Certain    Disagree
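
For a data frame like the one above, the same recoding can be sketched in base R, which also addresses the question's two goals of replacing the values inside the data frame and adding them up. This is an illustration under stated assumptions rather than either answerer's method: the data frame `survey` and the column name `Score` are hypothetical, and every cell is assumed to hold one of the three labels.

```r
# Small hypothetical data frame (character columns, not factors)
survey <- data.frame(
  FirstCol  = c("Agree", "Disagree", "Not Certain"),
  SecondCol = c("Agree", "Agree", "Disagree"),
  stringsAsFactors = FALSE
)

val <- c("Agree" = 2, "Disagree" = 1, "Not Certain" = 0.5)

# Work on a copy so the original answers are kept
recoded <- survey

# Assigning into recoded[] replaces the contents of every column
# while keeping the data.frame structure and column names;
# as.character() makes the lookup work for factor columns too
recoded[] <- lapply(survey, function(col) unname(val[as.character(col)]))

# Per-row score: sum of the numeric values across all columns
recoded$Score <- unname(rowSums(recoded))
```

Using `survey[] <- lapply(...)` instead of the copy would overwrite the original data frame in place.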
