Question

I am trying to recode a character variable as numeric.

The character variable looks like this:

b <- c("Your category choice is correct", "Your category choice is incorrect", ...

I tried the following:

b_recoded <- ifelse(b = "Your category choice is correct",  
c(1), c(0))

I get the following error:

unused argument (b = "Your category choice is correct")

How can I make this work? I am trying to code "Your category choice is correct" as 1 and "Your category choice is incorrect" as 0.

Sorry for the basic question. I am still learning.

Answer 1

If your variable is a character vector, you can use regular expressions to match the values:

p <- "Your category choice is"
s <- sample(c("correct", "incorrect"), 100, replace = TRUE)
b <- paste(p, s)
( foo <- ifelse(grepl(" correct$", b), 1, ifelse(grepl(" incorrect$", b), 0, NA)) )
  [1] 1 1 0 1 1 0 0 0 1 1 0 1 1 0 0 1 1 0 1 0 1 1 1 0 0 1 0 1 0 1 0 1 0 0 1 0 0
 [38] 1 1 1 1 0 0 1 0 0 0 0 1 1 0 1 0 0 1 1 1 1 0 0 1 1 0 0 1 1 1 0 0 0 0 0 1 1
 [75] 1 0 0 0 1 0 0 0 0 1 1 0 1 1 0 1 0 1 1 0 0 0 1 1 1 0
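
The leading space and the $ anchor in the patterns above matter because "incorrect" contains "correct" as a substring. A minimal sketch of the difference, using a hypothetical two-element vector x rather than the asker's data:

```r
# Hypothetical example: one value of each kind
x <- c("Your category choice is correct",
       "Your category choice is incorrect")

# An unanchored pattern matches both strings, because
# "incorrect" contains "correct"
loose <- grepl("correct", x)

# Requiring a preceding space and end-of-string tells them apart
strict <- grepl(" correct$", x)

print(loose)
print(strict)
```

With the loose pattern every value would be recoded as 1, which is why the answer anchors the match.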

Answer 2

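The problem in your ifelse statement is that you used a single equals sign for the logical expression. In R, = is used for assignment at the top level. Inside a function call it passes a named argument instead, so b = "Your category choice is correct" tries to pass that string to an argument named b. ifelse() has no such argument (its arguments are test, yes, and no), which is why R reports an unused argument.

To get a logical expression, you need to use two equals signs, ==. The following code does work (using mropa's data):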

b <- c(rep("Your category choice is correct", 3),
        rep("Your category choice is incorrect", 5),
        rep("Your category choice is correct", 2))

b_recoded <- ifelse(b == "Your category choice is correct",  1, 0)

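Also note that I left out the c() function, since you do not need to combine single elements.

If you are just starting out with R, it may be useful to read one of the introductory manuals, or at least keep one at hand as a reference. Here is one I liked when I was learning R: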
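
As a small sketch of what == produces here (the names x, target, is_correct, via_ifelse and via_cast are made up for this example): the comparison returns a logical vector, which ifelse() can use as its test, and which as.integer() can also convert directly, since TRUE becomes 1 and FALSE becomes 0.

```r
# Hypothetical three-element example
x <- c("Your category choice is correct",
       "Your category choice is incorrect",
       "Your category choice is correct")
target <- "Your category choice is correct"

# Element-wise comparison gives a logical vector
is_correct <- x == target

# Two ways to turn it into 1/0
via_ifelse <- ifelse(is_correct, 1, 0)  # numeric result
via_cast <- as.integer(is_correct)      # integer result

print(via_ifelse)
print(via_cast)
```

Note that the as.integer() shortcut maps every non-matching value to 0, while a nested ifelse() can keep unexpected values as NA.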

http://cran.r-project.org/doc/contrib/Paradis-rdebuts_en.pdf

Answer 3

Data:

df <- c(rep("Your category choice is correct", 3),
        rep("Your category choice is incorrect", 5),
        rep("Your category choice is correct", 2))

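This converts df to a factor. The levels are sorted alphabetically, so "Your category choice is correct" comes first and receives the label 1: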

df2 <- factor(df, labels = c(1,0))
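
One caveat with the factor approach, sketched below with a hypothetical vector x: the labels "1" and "0" are only level names. Calling as.numeric() directly on the factor returns the internal level codes (1 and 2 here), so the labels need to go through as.character() first to obtain the numbers 1 and 0.

```r
# Hypothetical three-element example
x <- c("Your category choice is correct",
       "Your category choice is incorrect",
       "Your category choice is correct")

# Spelling out the levels makes the label mapping explicit
# instead of relying on the default sort order
f <- factor(x,
            levels = c("Your category choice is correct",
                       "Your category choice is incorrect"),
            labels = c(1, 0))

codes <- as.numeric(f)                  # internal level codes
values <- as.numeric(as.character(f))   # labels converted to numbers

print(codes)
print(values)
```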

Working with factors can be a bit confusing at first. So if you would rather keep it as class numeric or integer, you can do

df3 <- df
df3[df3 == "Your category choice is correct"] <- 1
df3[df3 == "Your category choice is incorrect"] <- 0
df3 <- as.integer(df3)
