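I need to left join two data frames (X1 and X2) and keep only the unique columns, with no duplicates.
If all I needed were a plain join, the following code would work: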
merge(X1, X2)
Sample data:
X1<- data.frame("Group.Name"=c("Group1","Group2","Group1","Group2","Group2","Group2","Group1"),
"Sub_group_name"=c("A","A","B","C","D","E","B"),
"new_col"=c("Aa","Aa","Ba","Ca","Da","Ea","Ba"),
"Total"=c(35,26,10,9,5,11,13))
X2<- data.frame("Group.Name"=c("Group1","Group2","Group1","Group2","Group2"),
"Sub_group_name"=c("A","A","B","C","D"),
"new_col_b"=c(351,261,101,91,51),
"Total_b"=c(35,26,10,9,5))
Example of what I need:
Merge column -> Group.Name
merged dataframe columns -> Group.Name,Sub_group_name,new_col,new_col_b,Total_b
The code below also gives me all of the duplicated columns:
merge(x=X1,y=X2,by="Group.Name",all.x=TRUE)
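I also can't list the column names individually, because one of the data frames has more than 100 columns.
I searched but couldn't find an answer. Any help would be appreciated.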
Answer 0 (score: 1)
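A simple approach is to do a regular merge, then drop the extra columns that came from X2 (the ones suffixed with .y), and finally strip the .x suffix from any remaining names.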
out <- merge(x=X1,y=X2,by='c',all.x=TRUE)
# remove columns from X2
out <- out[!endsWith(names(out), '.y')]
# rename columns from X1
library(magrittr)
names(out)[endsWith(names(out), '.x')] %<>% substr(1, nchar(.) - 2)
out
#   c a b d e
# 1 1 1 2 1 1
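If you would rather not load magrittr for the renaming step, the same idea can be sketched in base R with sub(). This is only an illustration of the approach above, using the same small X1 and X2 as this answer. It assumes none of your original column names already end in .x or .y, since those would be dropped or renamed too.

```r
# Same toy data as used in this answer
X1 <- data.frame(a = 1, b = 2, c = 1, d = 1)
X2 <- data.frame(b = 1, c = 1, e = 1)

# Left join on column "c"; shared non-key columns get .x / .y suffixes
out <- merge(x = X1, y = X2, by = "c", all.x = TRUE)

# Drop the X2 copies of the shared columns (suffix ".y")
out <- out[, !endsWith(names(out), ".y"), drop = FALSE]

# Strip the ".x" suffix from the X1 copies, base R only
names(out) <- sub("\\.x$", "", names(out))

names(out)
# [1] "c" "a" "b" "d" "e"
```

The regular expression is anchored with $ so that only a trailing .x is removed, not a .x in the middle of a name.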
Data used:
X1 <- data.frame(a = 1, b = 2, c = 1, d = 1)
X2 <- data.frame(b = 1, c = 1, e = 1)
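An alternative that avoids the suffixes altogether is to trim X2 before merging, so that it only carries the key plus the columns X1 does not already have. The sketch below applies this to the sample data from the question and assumes the join key is Group.Name, as the question states. The names keys, x2_only and X2_trimmed are made up for illustration. If you actually want to match on several columns, put them all in keys.

```r
# Sample data from the question
X1 <- data.frame(
  Group.Name     = c("Group1","Group2","Group1","Group2","Group2","Group2","Group1"),
  Sub_group_name = c("A","A","B","C","D","E","B"),
  new_col        = c("Aa","Aa","Ba","Ca","Da","Ea","Ba"),
  Total          = c(35,26,10,9,5,11,13)
)
X2 <- data.frame(
  Group.Name     = c("Group1","Group2","Group1","Group2","Group2"),
  Sub_group_name = c("A","A","B","C","D"),
  new_col_b      = c(351,261,101,91,51),
  Total_b        = c(35,26,10,9,5)
)

# Assumption: the join key named in the question
keys <- "Group.Name"

# Columns that exist only in X2 (nothing has to be typed out by hand)
x2_only <- setdiff(names(X2), names(X1))

# Keep the key(s) plus the X2-only columns
X2_trimmed <- X2[, c(keys, x2_only), drop = FALSE]

# Left join: no column is shared apart from the key, so no .x / .y suffixes
out <- merge(x = X1, y = X2_trimmed, by = keys, all.x = TRUE)

names(out)
# [1] "Group.Name"     "Sub_group_name" "new_col"        "Total"
# [5] "new_col_b"      "Total_b"
```

Note that Group.Name is not unique in either data frame, so joining on it alone pairs every X1 row with every X2 row of the same group. Adding Sub_group_name to keys gives one-to-one matches on this sample data.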