I have a data frame, created with this:
df <- structure(list(samples = structure(c(1L, 2L, 5L, 6L), .Label = c("LAIV D0",
"LAIV D3", "LAIV D7", "TIV D0", "TIV D3", "TIV D7"), class = "factor"),
celltype = structure(c(1L, 1L, 7L, 7L), .Label = c("Neutrophil",
"Tcell", "Monocyte", "Bcell", "NKcell", "PlasmaCell", "DendriticCell"
), class = "factor"), score = c("0.1620678925564", "-0.0609851972808482",
"0.198920574361332", "-0.106111265294409")), .Names = c("samples",
"celltype", "score"), row.names = c(1L, 2L, 1140L, 1141L), class = "data.frame")
It looks like this:
> df
     samples      celltype               score
1    LAIV D0    Neutrophil     0.1620678925564
2    LAIV D3    Neutrophil -0.0609851972808482
1140  TIV D3 DendriticCell   0.198920574361332
1141  TIV D7 DendriticCell  -0.106111265294409
What I want to do is insert a column status based on a substring in samples.
If the string in samples contains LAIV, the status is control. If it contains TIV, the status is treated.
So in the end it looks like this:
     samples      celltype               score  status
1    LAIV D0    Neutrophil     0.1620678925564 control
2    LAIV D3    Neutrophil -0.0609851972808482 control
1140  TIV D3 DendriticCell   0.198920574361332 treated
1141  TIV D7 DendriticCell  -0.106111265294409 treated
How can I do this?
Answer 0 (score: 2)
You can combine grepl() with ifelse():
df$status <- ifelse(grepl("LAIV", df$samples), "control", "treated")
#> df
#     samples      celltype               score  status
#1    LAIV D0    Neutrophil     0.1620678925564 control
#2    LAIV D3    Neutrophil -0.0609851972808482 control
#1140  TIV D3 DendriticCell   0.198920574361332 treated
#1141  TIV D7 DendriticCell  -0.106111265294409 treated
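If the data contain observations that are neither "control" nor "treated", such as a third category or missing values, it is better to assign the values separately instead of using ifelse(), which would label every row that does not match LAIV as "treated":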
df$status[grepl("LAIV",df$samples)] <- "control"
df$status[grepl("TIV",df$samples)] <- "treated"
For the sample data, the result is the same.
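To see where the two approaches differ, here is a minimal sketch on made-up data. The demo data frame, the "Placebo D0" sample, and the column names status_ifelse and status_separate are all invented for illustration and are not part of the original data. The status column is also initialized to NA explicitly, which is an addition to the code above that makes the fallback value obvious.

```r
# Hypothetical data: "Placebo D0" and the NA entry are invented for illustration
demo <- data.frame(
  samples = c("LAIV D0", "TIV D3", "Placebo D0", NA),
  stringsAsFactors = FALSE
)

# ifelse(): every row that does not match "LAIV" ends up as "treated",
# including the placebo row and the missing value (grepl() returns FALSE for NA)
demo$status_ifelse <- ifelse(grepl("LAIV", demo$samples), "control", "treated")

# Separate assignment: start from NA so that unmatched rows stay NA
demo$status_separate <- NA_character_
demo$status_separate[grepl("LAIV", demo$samples)] <- "control"
demo$status_separate[grepl("TIV", demo$samples)] <- "treated"

print(demo)
```

With this data, status_ifelse marks the placebo and missing rows as "treated", while status_separate leaves them as NA, which makes unexpected categories easy to spot afterwards with is.na().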