Question

The solution from the question this was marked as a duplicate of gives me an error when I try it on a dataset with categorical data.

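I have a table with several columns. One column, column A, has the values 0, 1, 2, 3, 4. These are codes for specific conditions. I am trying to add another column, column Z, to the table that is 0 if the value in column A is 0, and 1 if the value in column A is 3 or 4: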

pheno_table$newcolumnZ <- NA  # create the column first; rows matching neither case stay NA
for (i in 1:nrow(pheno_table)) {
  if (pheno_table$columnA[i] == 0) {
    pheno_table$newcolumnZ[i] <- 0
  } else if (pheno_table$columnA[i] %in% c(3, 4)) {
    pheno_table$newcolumnZ[i] <- 1
  }
}

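Thank you very much, @see24! Also, I did try this, setting the working directory and so on, but I cannot see the file in the folder (I checked the path):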

    setwd('/pathtofolder/')

    library(dplyr)
    df <- data.frame(A = originaltablefile$column_of_interest)
    newcolumn <- df %>%
      mutate(newcolumn = case_when(A == 0 ~ 0,
                                   A %in% c(3, 4) ~ 1,
                                   TRUE ~ NA_real_))
    finaltablefile <- cbind(originaltablefile, newcolumn)

I cannot see finaltablefile in my folder.

Answer 1

I like to use the mutate and case_when functions from the dplyr package:

library(dplyr)
df <- data.frame(A = c(1,2,3,4,0),B = c(3,4,5,6,7))
df2 <- df %>% mutate(Z = case_when(A == 0 ~ 0,
                            A %in% c(3,4) ~ 1,
                            TRUE ~ NA_real_))

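I am assuming you want NA for rows that are not 0, 3, or 4. The TRUE part means "if none of the above, then ...". You have to use NA_real_ because case_when requires all of the outputs to be the same type.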
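On the follow-up about not seeing finaltablefile in the folder: as far as I can tell, the code in the question only creates an object in the R session, and nothing in it writes to disk. Here is a minimal sketch of one way to do that, assuming a CSV file is what you want. The sample data, the `other` column, and the output file name are made up for illustration, and `tempdir()` stands in for your own folder.

```r
library(dplyr)

# Hypothetical stand-in for the asker's table; the values and the
# `other` column are assumptions for illustration only.
originaltablefile <- data.frame(column_of_interest = c(1, 2, 3, 4, 0),
                                other = c("a", "b", "c", "d", "e"))

# Add Z directly to the table rather than building a separate data frame
# and binding it back on with cbind().
finaltablefile <- originaltablefile %>%
  mutate(Z = case_when(column_of_interest == 0 ~ 0,
                       column_of_interest %in% c(3, 4) ~ 1,
                       TRUE ~ NA_real_))

# The result lives only in memory until it is written out explicitly.
# The file name is hypothetical; replace tempdir() with your folder.
out_path <- file.path(tempdir(), "finaltablefile.csv")
write.csv(finaltablefile, out_path, row.names = FALSE)
```

After this runs, the file should show up at `out_path`. If you use `setwd()` instead, a bare file name such as "finaltablefile.csv" would be written relative to that working directory.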

R: add a column to a table based on a condition
