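How to match columns in R and extract values
As an example: I want to match the Name and City columns of dataframe_one against dataframe_two, then return an output with two additional columns, temp and ID. If a row matches, temp should be TRUE and the matching ID should be returned.
My input is: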
dataframe_one
Name City
Sarah ON
David BC
John KN
Diana AN
Judy ON
dataframe_two
Name City ID
Dave ON 1092
Diana AN 2314
Judy ON 1290
Ari KN 1450
Shanu MN 1983
I want the output to be:
Name City temp ID
Sarah ON FALSE NA
David BC TRUE 1450
John KN TRUE 1983
Diana AN FALSE NA
Judy ON FALSE NA
Answer 0 (score: 0)
One thing that would make questions like this easier to answer is if you at least put the data frames into R, like so:
df1 <- data.frame(stringsAsFactors=FALSE,
  Name = c("Sarah", "David", "John", "Diana", "Judy"),
  City = c("ON", "BC", "KN", "AN", "ON")
)
df2 <- data.frame(stringsAsFactors=FALSE,
  Name = c("Dave", "Diana", "Judy", "Ari", "Shanyu"),
  City = c("ON", "AN", "ON", "KN", "MN"),
  ID = c(1092, 2314, 1290, 1450, 1983)
)
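Then search for existing Stack Overflow questions where something similar has already been answered (for example, How to join (merge) data frames (inner, outer, left, right)).
Since neither of your original data frames contains a temp column, you will need to create it in the joined (merged) data frame. We would be better placed to help if you had at least made a start; the community could then help you troubleshoot.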
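As a rough illustration of creating temp in the joined data frame, here is a minimal base R sketch. It assumes that "match" means both Name and City are equal, and that ID is never NA in df2, so a missing ID after the join can stand in for "no match". Note that merge() may reorder the rows relative to df1.

```r
df1 <- data.frame(
  Name = c("Sarah", "David", "John", "Diana", "Judy"),
  City = c("ON", "BC", "KN", "AN", "ON"),
  stringsAsFactors = FALSE
)
df2 <- data.frame(
  Name = c("Dave", "Diana", "Judy", "Ari", "Shanyu"),
  City = c("ON", "AN", "ON", "KN", "MN"),
  ID = c(1092, 2314, 1290, 1450, 1983),
  stringsAsFactors = FALSE
)

# Left join: keep every row of df1 and attach ID where Name and City both match.
joined <- merge(df1, df2, by = c("Name", "City"), all.x = TRUE)

# Assumption: ID is never NA in df2, so a non-missing ID means the row matched.
joined$temp <- !is.na(joined$ID)

# Put the columns in the order requested in the question.
joined <- joined[, c("Name", "City", "temp", "ID")]
print(joined)
```

Under this reading only Diana and Judy get temp = TRUE, which does not agree with the output table in the question.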
That being said, I can't for the life of me figure out how to generate the output df from the input.
Answer 1 (score: 0)
Using biomiha's code to generate df1 and df2:
df1 <- data.frame(stringsAsFactors=FALSE,
  Name = c("Sarah", "David", "John", "Diana", "Judy"),
  City = c("ON", "BC", "KN", "AN", "ON")
)
df2 <- data.frame(stringsAsFactors=FALSE,
  Name = c("Dave", "Diana", "Judy", "Ari", "Shanyu"),
  City = c("ON", "AN", "ON", "KN", "MN"),
  ID = c(1092, 2314, 1290, 1450, 1983)
)
You can try:
library(dplyr)
df1 %>%
  left_join(df2, by = c("Name" = "Name", "City" = "City")) %>%
  mutate(temp = !is.na(ID))
Which gives the output:
   Name City   ID  temp
1 Sarah   ON   NA FALSE
2 David   BC   NA FALSE
3  John   KN   NA FALSE
4 Diana   AN 2314  TRUE
5  Judy   ON 1290  TRUE
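The result above lists ID before temp, whereas the question asked for Name, City, temp, ID. If dplyr is not available, the same lookup can be sketched in base R with match(), which keeps the rows in df1's order. This is an alternative technique, not the code from this answer. It assumes the "|" separator never appears in Name or City, and that each Name and City pair occurs at most once in df2 (match() only returns the first hit, while a join would duplicate rows).

```r
df1 <- data.frame(
  Name = c("Sarah", "David", "John", "Diana", "Judy"),
  City = c("ON", "BC", "KN", "AN", "ON"),
  stringsAsFactors = FALSE
)
df2 <- data.frame(
  Name = c("Dave", "Diana", "Judy", "Ari", "Shanyu"),
  City = c("ON", "AN", "ON", "KN", "MN"),
  ID = c(1092, 2314, 1290, 1450, 1983),
  stringsAsFactors = FALSE
)

# Build one composite key per row.
# Assumption: "|" does not occur in any Name or City value.
key1 <- paste(df1$Name, df1$City, sep = "|")
key2 <- paste(df2$Name, df2$City, sep = "|")

# Position of each df1 key in df2, or NA when there is no match.
idx <- match(key1, key2)

result <- data.frame(
  Name = df1$Name,
  City = df1$City,
  temp = !is.na(idx),   # TRUE when both Name and City were found in df2
  ID = df2$ID[idx],     # NA for unmatched rows
  stringsAsFactors = FALSE
)
print(result)
```

With dplyr, appending select(Name, City, temp, ID) to the pipeline above should give the same column order.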