I am trying to create a new column in an existing dataframe (df1) by comparing the values of column x to the values of column y in a different dataframe (df2).
The result should look like df_end. Where there is a match, the value of column x should be returned. Where there is no match, NA should be returned.
df1 <- data.frame(x = c("blue2", "blue6", "green9", "green7"))
df2 <- data.frame(y = c("blue2", "green9"))
df_end <- data.frame(x = c("blue2", "blue6", "green9", "green7"),
match = c("blue2", NA, "green9", NA))
I have experimented with merge, match and if,else statements, but I can't figure it out. Someone has a piece of advice for me?
#Attempt 1: Merge
df1$match <- merge(df1, df2, by.x = x, all = TRUE)
This does not work, because df1 and df2 are of different length.
答案 0 :(得分:1)
我做了以下事情:
df1 <- data.frame(x = c("blue2", "blue6", "green9", "green7"))
df2 <- data.frame(y = c("blue2", "green9"))
end <- sapply(df1$x, function(x) { # for each value in df1$x
j <- which(df2$y == x) # check if df2$y has a match
ifelse(length(j) > 0, j, NA) # if there is, give the location in the vector
}) # if not give NA
cbind(df1,df2, match = df2$y[end]) # subset the df2 with the location to get the characters
# x y match
#1 blue2 blue2 blue2
#2 blue6 green9 <NA>
#3 green9 blue2 green9
#4 green7 green9 <NA>
修改强>
见sotos&#39;评论最佳答案:df2$y[match(df1$x, df2$y)]