I would like to use dplyr's left_join to transfer the values of a column ("new") from one data frame to another.
How can I do that if I do not know the name of the key, but only know that it is the first variable in the dataset?
library(dplyr)
testData1 <- data.frame(idvar=c(1,2,3),
b=c("a","b","c"),
c=c("i","ii","iii"))
testData2 <- data.frame(identification=c(1,2),
b=c("a","b"),
c=c("i","NA"),
new=c("var1","var2"))
# now do a left join to obtain values of the new variable in the old dataset
(testResult1 <- left_join(testData1,testData2))
# var2 is not in the results: with no `by`, the join uses the shared columns b and c,
# and c differs in row 2 ("ii" vs. the string "NA" in testData2)!
(testResult2 <- left_join(testData1,testData2,
by=c("idvar"="identification")))
# works as expected! ... but we do not know the name of the idvar!
(testResult3 <- left_join(testData1,testData2,
by=c(names(testData1)[1]=names(testData2)[1])))
# Error: unexpected '=' in:
# "testResult3 <- left_join(testData1,testData2,
# by=c(names(testData1)[1]="
Answer 0 (score: 3)
An alternative is to make the two key columns have the same name:
left_join(
testData1,
rename_at(testData2, 1, ~ names(testData1)[1]),
by = names(testData1)[1]
)
#   idvar b.x c.x  b.y  c.y  new
# 1     1   a   i    a    i var1
# 2     2   b  ii    b   NA var2
# 3     3   c iii <NA> <NA> <NA>
# > (testResult2 <- left_join(testData1,testData2, by=c("idvar"="identification")))
#   idvar b.x c.x  b.y  c.y  new
# 1     1   a   i    a    i var1
# 2     2   b  ii    b   NA var2
# 3     3   c iii <NA> <NA> <NA>
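The same renaming idea can also be sketched with rename_with(), which dplyr offers as the newer counterpart to the scoped rename_at(). This is only an illustrative variant, not part of the original answer; the names `key` and `testData2_renamed` are made up for the example, and the test data is repeated so the snippet runs on its own.

```r
library(dplyr)

testData1 <- data.frame(idvar = c(1, 2, 3),
                        b = c("a", "b", "c"),
                        c = c("i", "ii", "iii"))
testData2 <- data.frame(identification = c(1, 2),
                        b = c("a", "b"),
                        c = c("i", "NA"),
                        new = c("var1", "var2"))

# Name of the key column in the first data frame (hypothetical helper variable)
key <- names(testData1)[1]

# Rename the first column of testData2 so both keys share one name
testData2_renamed <- rename_with(testData2, ~ key, .cols = 1)

# Join on the now-common key; the other shared columns get .x/.y suffixes
result <- left_join(testData1, testData2_renamed, by = key)
```

Because both data frames then share the key name, `by` can be a plain character string and no named vector is needed.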
Answer 1 (score: 2)
You could create the named vector in advance and then join as follows:
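# named vector: the name is the key in testData1, the value is the key in testData2
# (called join_cols rather than join_by to avoid confusion with dplyr's join_by() function)
join_cols <- colnames(testData2)[1]
names(join_cols) <- colnames(testData1)[1]
left_join(testData1, testData2, by = join_cols)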
or in one line:
left_join(testData1,testData2,
by=structure(colnames(testData2)[1], names=colnames(testData1)[1]))
or alternatively, as suggested by Artem:
left_join(testData1,testData2,
by=setNames(colnames(testData2)[1], colnames(testData1)[1]))
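If this pattern comes up often, the setNames() call could be wrapped in a small helper. The sketch below assumes the key is always the first column of both data frames; `left_join_first_col` is a hypothetical name, not a dplyr function, and the test data is repeated so the snippet runs on its own.

```r
library(dplyr)

testData1 <- data.frame(idvar = c(1, 2, 3),
                        b = c("a", "b", "c"),
                        c = c("i", "ii", "iii"))
testData2 <- data.frame(identification = c(1, 2),
                        b = c("a", "b"),
                        c = c("i", "NA"),
                        new = c("var1", "var2"))

# Hypothetical helper: left join x and y on their first columns,
# whatever those columns happen to be called.
left_join_first_col <- function(x, y) {
  # Build c(<first name of x> = <first name of y>) programmatically
  by_cols <- setNames(names(y)[1], names(x)[1])
  left_join(x, y, by = by_cols)
}

joined <- left_join_first_col(testData1, testData2)
```

The helper works where the original attempt failed because setNames() computes the vector's name at run time, whereas the left-hand side of `=` inside c() must be a literal name rather than an expression.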
Hope this helps!