Using paste in a data frame column call

Time: 2017-06-14 15:14:17

Tags: r dataframe paste

combine_cols<- function(primary,secondary,linker,column) {
require(data.table) 
a<-data.table("Sample"=primary[,linker], primary[,column])
b<-data.table("Sample"=secondary[,linker], secondary[,column])

c <- merge(a, b, by = "Sample", all=TRUE)
c[,Status := ifelse(!is.na(c[,paste0(column,".x")]), paste0(column,".x"), 
paste0(column,".y"))]
c[,`:=` (paste0(column,".x")=NULL, paste0(column,".y")= NULL)]

return(c)
}
mydata1<-data.frame("Sample"=c("100","101","102","103"),"Status"=c("Y","","","partial"))
mydata2<-data.frame("Sample"=c("100","101","102","103","106"),"Status"=c("NA","Y","","","Y"))
print((combine_cols(mydata1,mydata2,"Sample",c("Status"))))

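I am trying to write a function that merges data columns that have been split across two tables. The ifelse line does not work because paste0(column,".x") is treated as a character string rather than as a column name. How can I make sure that c[,paste0(column,".x")] refers to the column c$c[,paste0(column,".x")]? Better still, how can I modify this line so that it handles a list of column names?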
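The behaviour described here can be reproduced on a small table. The sketch below is only an illustration: `dt`, `x_col`, `as_string` and `as_column` are made-up names, and the two-row table stands in for the merged result. It contrasts what `paste0()` returns inside `j` with what `get()` returns for the same string.

```r
library(data.table)

# Hypothetical toy table standing in for the merged result
dt <- data.table(Sample   = c("100", "106"),
                 Status.x = c("Y", NA),
                 Status.y = c(NA, "Y"))

column <- "Status"
x_col  <- paste0(column, ".x")

# Inside j, paste0() only builds a string; the string is not looked up as a column
as_string <- dt[, paste0(column, ".x")]

# get() resolves the string to the values of the column with that name
# (dt[[x_col]] would do the same outside of j)
as_column <- dt[, get(x_col)]

print(as_string)
print(as_column)
```

Because `!is.na("Status.x")` is always TRUE, the original ifelse call never inspects the actual column values.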

1 answer:

Answer 0 (score: 0)

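Just use a fixed placeholder name inside the function and rename the columns afterwards; the code is more readable that way.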

a<-data.table("Sample"=primary[,linker], "tempname" =primary[,column])        # added tempname
b<-data.table("Sample"=secondary[,linker], "tempname" =secondary[,column])    # added tempname
c <- merge(a, b, by = "Sample", all=TRUE)
c[,Status := ifelse(!is.na(tempname.x),tempname.x,tempname.y)]
setnames(c,paste0("tempname",c(".x",".y")),paste0(column,c(".x",".y")))

With your example:

   Sample Status.x Status.y Status
1:    100        Y       NA      3
2:    101                 Y      1
3:    102                        1
4:    103  partial               2
5:    106       NA        Y      3
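The integers in the Status column look like factor codes rather than labels. A likely cause, assuming an R version before 4.0, is that data.frame() converted the strings to factors by default, and ifelse() drops factor attributes. The sketch below uses made-up vectors (`x`, `y`) to show the difference between factor and character input.

```r
# Hypothetical vectors standing in for Status.x and Status.y
x <- factor(c("Y", "", "partial", NA))
y <- factor(c("", "Y", "", "Y"))

# Factor input: ifelse() returns the underlying integer codes
codes <- ifelse(!is.na(x), x, y)

# Character input: ifelse() returns the strings themselves
x_chr  <- as.character(x)
y_chr  <- as.character(y)
labels <- ifelse(!is.na(x_chr), x_chr, y_chr)

print(codes)
print(labels)
```

Creating the input with `stringsAsFactors = FALSE`, or converting the columns with `as.character()` before merging, should keep the labels intact.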

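I don't know what the next line (the one before return) is supposed to do. It will fail, but since it is not part of the question (yet), I have left it as it is.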
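If that line is meant to drop the two suffixed columns, which is an assumption, one possible shape for the whole function is sketched below. It is not the asker's or the answerer's code: the argument `columns`, the local names `merged`, `x_col`, `y_col`, and the use of `[[` instead of a placeholder name are choices made for this illustration. It also loops over several column names, which the question asks about.

```r
library(data.table)

# Hypothetical rewrite of combine_cols; `columns` may hold one or more names
combine_cols <- function(primary, secondary, linker, columns) {
  a <- as.data.table(primary)[, c(linker, columns), with = FALSE]
  b <- as.data.table(secondary)[, c(linker, columns), with = FALSE]

  # Shared non-key columns come back with .x / .y suffixes
  merged <- merge(a, b, by = linker, all = TRUE)

  for (col in columns) {
    x_col <- paste0(col, ".x")
    y_col <- paste0(col, ".y")
    # (col) := assigns to the column whose name is stored in `col`;
    # the primary value is kept unless it is NA
    merged[, (col) := ifelse(!is.na(merged[[x_col]]),
                             merged[[x_col]], merged[[y_col]])]
    # A character vector on the left of := NULL deletes those columns
    merged[, c(x_col, y_col) := NULL]
  }
  merged[]
}

# stringsAsFactors = FALSE keeps the labels as character strings
mydata1 <- data.frame(Sample = c("100", "101", "102", "103"),
                      Status = c("Y", "", "", "partial"),
                      stringsAsFactors = FALSE)
mydata2 <- data.frame(Sample = c("100", "101", "102", "103", "106"),
                      Status = c("NA", "Y", "", "", "Y"),
                      stringsAsFactors = FALSE)

result <- combine_cols(mydata1, mydata2, "Sample", "Status")
print(result)
```

Note that empty strings are not NA, so a blank value in the primary table still takes precedence over a filled value in the secondary table. If blanks should count as missing, they would need to be converted to NA before the merge.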