How to insert a column into a data frame with values based on a condition

Time: 2016-08-13 09:50:13

Tags: r dataframe

I have a data frame, created with this:

df <- structure(list(samples = structure(c(1L, 2L, 5L, 6L), .Label = c("LAIV D0",
"LAIV D3", "LAIV D7", "TIV D0", "TIV D3", "TIV D7"), class = "factor"),
    celltype = structure(c(1L, 1L, 7L, 7L), .Label = c("Neutrophil",
    "Tcell", "Monocyte", "Bcell", "NKcell", "PlasmaCell", "DendriticCell"
    ), class = "factor"), score = c("0.1620678925564", "-0.0609851972808482",
    "0.198920574361332", "-0.106111265294409")), .Names = c("samples",
"celltype", "score"), row.names = c(1L, 2L, 1140L, 1141L), class = "data.frame")

It looks like this:

> df
     samples      celltype               score
1    LAIV D0    Neutrophil     0.1620678925564
2    LAIV D3    Neutrophil -0.0609851972808482
1140  TIV D3 DendriticCell   0.198920574361332
1141  TIV D7 DendriticCell  -0.106111265294409

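What I want to do is insert a column status based on a substring in samples. If the string in samples contains LAIV, the status is control. If it contains TIV, the status is treated.

So in the end it looks like this: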

     samples      celltype               score  status 
1    LAIV D0    Neutrophil     0.1620678925564  control
2    LAIV D3    Neutrophil -0.0609851972808482  control
1140  TIV D3 DendriticCell   0.198920574361332  treated
1141  TIV D7 DendriticCell  -0.106111265294409  treated

How can I do this?

1 Answer:

Answer 0 (score: 2)

You can combine grepl() with ifelse():

df$status <- ifelse(grepl("LAIV", df$samples), "control", "treated")
#> df
#     samples      celltype               score  status
#1    LAIV D0    Neutrophil     0.1620678925564 control
#2    LAIV D3    Neutrophil -0.0609851972808482 control
#1140  TIV D3 DendriticCell   0.198920574361332 treated
#1141  TIV D7 DendriticCell  -0.106111265294409 treated
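
To make the mechanics easier to follow, here is a minimal self-contained sketch of the same idea on a small hypothetical data frame (the name demo and its contents are illustrative, not from the question). It assumes the group name always appears at the start of the string, so the pattern is anchored with ^. grepl() coerces the factor column to character before matching.

```r
# Hypothetical mini data frame mirroring the samples column of the question
demo <- data.frame(
  samples = factor(c("LAIV D0", "LAIV D3", "TIV D3", "TIV D7"))
)

# grepl() returns one TRUE/FALSE per row; ifelse() maps TRUE to "control"
# and FALSE to "treated"
demo$status <- ifelse(grepl("^LAIV", demo$samples), "control", "treated")

# Cross-tabulate samples against status to check the labels
table(demo$samples, demo$status)
```

The cross-tabulation is only a quick check that each sample name ended up in the intended group.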

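If the data contains observations that are neither "control" nor "treated", such as a third category or missing values, it is better to assign the values separately instead of using ifelse():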

df$status[grepl("LAIV", df$samples)] <- "control"
df$status[grepl("TIV", df$samples)] <- "treated"

For the sample data, the result is the same.
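
The difference between the two approaches only shows up when some rows match neither group. The sketch below uses hypothetical samples (the "Placebo D0" entry and the NA are invented for illustration) and, as an added precaution not present in the answer, initializes the column to NA before filling it.

```r
# Hypothetical samples: two known groups, one unknown group, one missing value
demo <- data.frame(
  samples = c("LAIV D0", "TIV D3", "Placebo D0", NA),
  stringsAsFactors = FALSE
)

# ifelse() version: every row that does not contain "LAIV" becomes "treated",
# including the unknown group and the missing value (grepl() gives FALSE for NA)
demo$status_ifelse <- ifelse(grepl("LAIV", demo$samples), "control", "treated")

# Separate assignment: start from NA, then fill only the rows that match
demo$status <- NA_character_
demo$status[grepl("LAIV", demo$samples)] <- "control"
demo$status[grepl("TIV", demo$samples)] <- "treated"

demo
```

With ifelse(), the "Placebo D0" and NA rows are silently labeled "treated". With separate assignment they stay NA, which makes unexpected categories easy to spot.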