Comparing columns in two different data frames

Time: 2013-02-11 16:59:07

Tags: r

I have two data frames, and I want to match a column in the first one (source) against the second column of the other (source_id).

df = data.frame(source=c("XRHxl8gq","2b790Qqv","mrgapJpQ","EsMfIbv1","ujOBob24","ujOBob24","EsMfIbv1"),
                conv=c(362,247,222,160,86,65,34), all=c(19,17,26,12,22,25,11), intent=c(47,47,74,31,58,60,0))

df2 = data.frame(name=c("Bob","David","Mark","Sara","Alice","Cara","Chad","Donna","Elaine","Gary"),
                 source_id=c("XRHxl8gq","sr354136FH","2b790Qqv","myx645TH","mrgapJpQ","EsMfI546",
                             "ujOBob24","EsMfIbv1","fMHL45ts","sefihn"))

What I want to end up with is each source matched to its source_id, so that I can insert a new column named who into df:

> df
    source conv all intent   who
1 XRHxl8gq  362  19     47   Bob
2 2b790Qqv  247  17     47  Mark
3 mrgapJpQ  222  26     74 Alice
4 EsMfIbv1  160  12     31 Donna
5 ujOBob24   86  22     58  Chad
6 ujOBob24   65  25     60  Chad
7 EsMfIbv1   34  11      0 Donna

# find what values in both columns are similar.
both = intersect(df[,1], df2[,2]) # IN BOTH COLUMNS

# create a new column in the original data frame.
df$who = c("")

# match up source with source_id.
str(df2)
df2$name = as.character(df2$name)
df$who[df$source %in% df2$source_id] <- df2$name
df

df$who[which(df$source %in% df2$source_id)]<-as.character(df2$name)
df

Unfortunately, I can't seem to match the columns so that each source gets the name associated with the same source_id.

Can anyone help?

1 Answer:

Answer 0: (score: 8)

You are looking for merge:

merge(df, df2, by.x="source", by.y="source_id", sort=FALSE)

#     source conv all intent  name
# 1 XRHxl8gq  362  19     47   Bob
# 2 2b790Qqv  247  17     47  Mark
# 3 mrgapJpQ  222  26     74 Alice
# 4 EsMfIbv1  160  12     31 Donna
# 5 EsMfIbv1   34  11      0 Donna
# 6 ujOBob24   86  22     58  Chad
# 7 ujOBob24   65  25     60  Chad
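
As for why the attempt in the question does not work: df$source %in% df2$source_id only says whether each source occurs somewhere in df2, and here it is TRUE for all seven rows, so the assignment just copies the first seven names from df2 by position (R also warns that the lengths do not fit). If the goal is to keep df in its original row order and add the column under the name who, a lookup with match() is one possible alternative to merge. The following is a minimal sketch rather than the only way to do it; the helper variable idx and the stringsAsFactors = FALSE arguments are my additions and are not part of the question.

```r
# Same data as in the question. stringsAsFactors = FALSE (an addition here)
# keeps ids and names as character vectors on R versions before 4.0.
df <- data.frame(source = c("XRHxl8gq", "2b790Qqv", "mrgapJpQ", "EsMfIbv1",
                            "ujOBob24", "ujOBob24", "EsMfIbv1"),
                 conv   = c(362, 247, 222, 160, 86, 65, 34),
                 all    = c(19, 17, 26, 12, 22, 25, 11),
                 intent = c(47, 47, 74, 31, 58, 60, 0),
                 stringsAsFactors = FALSE)

df2 <- data.frame(name = c("Bob", "David", "Mark", "Sara", "Alice", "Cara",
                           "Chad", "Donna", "Elaine", "Gary"),
                  source_id = c("XRHxl8gq", "sr354136FH", "2b790Qqv", "myx645TH",
                                "mrgapJpQ", "EsMfI546", "ujOBob24", "EsMfIbv1",
                                "fMHL45ts", "sefihn"),
                  stringsAsFactors = FALSE)

# For every df$source, match() gives the position of the first equal
# df2$source_id, or NA when there is none. idx is a hypothetical helper name.
idx <- match(df$source, df2$source_id)

# Index df2$name by those positions: one name per row of df, in df's order.
df$who <- df2$name[idx]
df
```

A source with no counterpart in df2 would get NA in who instead of being dropped, whereas merge drops unmatched rows unless all.x=TRUE is given. Note also that the merge documentation describes the row order for sort=FALSE as unspecified, so the match() version may be preferable when the original order matters.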