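I have a data frame named dd2. I need to paste together the values in Left.Gene.Symbols and Right.Gene.Symbols. The code below does that, but when a value is missing I don't want NA pasted into the string. I want the output to look like the combination column shown in result.
My code: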
# turn the string 'NA' into a real missing value (NA)
dd2[dd2 == 'NA'] <- NA
#pasting values together
result <- cbind(dd2,combination = paste(dd2[,"Left.Gene.Symbols"],dd2[,"Right.Gene.Symbols"],sep="*"))
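For context, here is a minimal sketch of why the code above does not yet give the desired output: paste() coerces a missing value to the literal string "NA" before joining. The variable name joined is only illustrative.

```r
# paste() converts NA to the string "NA" before concatenating,
# so a missing right-hand value shows up in the result.
joined <- paste("AK2", NA, sep = "*")
print(joined)
# [1] "AK2*NA"
```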
Data
dd2<- structure(c("AMLM12001KP", "AMLM12001KP", "AMLM12001KP", "AMLM12001KP",
"AMLM12001KP", "AK2", "HFM1", "HFM1", "HFM1", "HFM1", NA, "PPT",
NA, "GGT", NA), .Dim = c(5L, 3L), .Dimnames = list(NULL, c("customer_sample_id",
"Left.Gene.Symbols", "Right.Gene.Symbols")))
Result
customer_sample_id Left.Gene.Symbols Right.Gene.Symbols combination
[1,] "AMLM12001KP" "AK2" NA AK2*
[2,] "AMLM12001KP" "HFM1" "PPT" HFM1*PPT
[3,] "AMLM12001KP" "HFM1" NA HFM1*
[4,] "AMLM12001KP" "HFM1" "GGT" HFM1*GGT
[5,] "AMLM12001KP" "HFM1" NA HFM1*
Answer 0 (score: 4)
You could do the following, temporarily replacing the NA values with the empty string "".
cbind(
dd2,
combination = paste(dd2[,2], replace(dd2[,3], is.na(dd2[,3]), ""), sep = "*")
)
# customer_sample_id Left.Gene.Symbols Right.Gene.Symbols combination
# [1,] "AMLM12001KP" "AK2" NA "AK2*"
# [2,] "AMLM12001KP" "HFM1" "PPT" "HFM1*PPT"
# [3,] "AMLM12001KP" "HFM1" NA "HFM1*"
# [4,] "AMLM12001KP" "HFM1" "GGT" "HFM1*GGT"
# [5,] "AMLM12001KP" "HFM1" NA "HFM1*"
Of course, you can use the column names in place of the column numbers above. I didn't write them out because they are too long.
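As a sketch of that named-column variant, the same replace() approach can be written as follows. The data is rebuilt from the question so the snippet runs on its own; the intermediate variable right is only illustrative.

```r
# Rebuild the matrix from the question so the example is self-contained.
dd2 <- structure(c("AMLM12001KP", "AMLM12001KP", "AMLM12001KP", "AMLM12001KP",
                   "AMLM12001KP", "AK2", "HFM1", "HFM1", "HFM1", "HFM1",
                   NA, "PPT", NA, "GGT", NA),
                 .Dim = c(5L, 3L),
                 .Dimnames = list(NULL, c("customer_sample_id",
                                          "Left.Gene.Symbols",
                                          "Right.Gene.Symbols")))

# Take the right-hand column by name and blank out its missing values
# in a copy only; dd2 itself keeps its NAs.
right <- dd2[, "Right.Gene.Symbols"]
result <- cbind(
  dd2,
  combination = paste(dd2[, "Left.Gene.Symbols"],
                      replace(right, is.na(right), ""),
                      sep = "*")
)
print(result[, "combination"])
# [1] "AK2*"     "HFM1*PPT" "HFM1*"    "HFM1*GGT" "HFM1*"
```

Because replace() returns a modified copy, the NA entries in the original Right.Gene.Symbols column are left untouched.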
Answer 1 (score: 3)
Using ifelse:
ifelse(is.na(dd2[,3]),paste0(dd2[,2],"*"),paste(dd2[,2],dd2[,3],sep="*"))
#[1] "AK2*" "HFM1*PPT" "HFM1*" "HFM1*GGT" "HFM1*"
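The ifelse call above only checks the right-hand column, which is enough for the sample data. If the left-hand column could also contain NA (an assumption, since the question's data has none), one possible sketch is a small helper that blanks missing values on both sides before pasting. The name paste_na_blank is hypothetical and not part of any package.

```r
# Hypothetical helper: treat NA on either side as an empty string,
# then join the two vectors with the separator.
paste_na_blank <- function(left, right, sep = "*") {
  left[is.na(left)] <- ""
  right[is.na(right)] <- ""
  paste(left, right, sep = sep)
}

combined <- paste_na_blank(c("AK2", NA, "HFM1"), c(NA, "PPT", "GGT"))
print(combined)
# [1] "AK2*"     "*PPT"     "HFM1*GGT"
```

Note that the separator is always kept, which matches the "AK2*" style of the desired result in the question.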
Answer 2 (score: 3)
We can use NAer from the qdap package together with sprintf:
library(qdap)
sprintf('%s*%s', dd2[,2],NAer(dd2[,3],''))
#[1] "AK2*" "HFM1*PPT" "HFM1*" "HFM1*GGT" "HFM1*"