Creating two new columns by matching on multiple columns

Time: 2019-11-26 10:57:04

Tags: r

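How do I match columns in R and extract values? As an example: I want to match dataframe_one against dataframe_two on the Name and City columns, then return an output with two additional columns, temp and ID. If a row matches, temp should be TRUE and the matching ID should be returned as well.

My input is: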

dataframe_one

Name    City
Sarah   ON
David   BC
John    KN
Diana   AN
Judy    ON

dataframe_two

Name    City    ID
Dave    ON     1092
Diana   AN     2314
Judy    ON     1290
Ari     KN     1450
Shanu   MN     1983

I want the output to be:

Name    City    temp    ID
Sarah   ON   FALSE     NA
David   BC   TRUE     1450
John    KN   TRUE     1983
Diana   AN   FALSE    NA
Judy    ON   FALSE    NA

2 answers:

Answer 0: (score: 0)

One thing that makes questions like this easier to answer is if you at least put the data frames into R, like this:

df1 <- data.frame(stringsAsFactors=FALSE,
                  Name = c("Sarah", "David", "John", "Diana", "Judy"),
                  City = c("ON", "BC", "KN", "AN", "ON")
)

df2 <- data.frame(stringsAsFactors=FALSE,
                  Name = c("Dave", "Diana", "Judy", "Ari", "Shanyu"),
                  City = c("ON", "AN", "ON", "KN", "MN"),
                  ID = c(1092, 2314, 1290, 1450, 1983)
)

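Then search for existing Stack Overflow questions where something similar has already been answered (for example, How to join (merge) data frames (inner, outer, left, right)).

Since neither of your original data frames contains a "temp" column, you will need to create it in the joined (merged) data frame. If you at least make a start yourself, the community will be in a better position to help you troubleshoot.

That said, I can't for the life of me figure out how your output df is supposed to be produced from the inputs.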
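For what it's worth, here is a minimal base R sketch of the join-then-flag idea described above. It assumes that a "match" means both Name and City are equal, and that ID is never NA in df2, so a missing ID after the join can stand in for "no match". The name `merged` is just an illustrative choice. Note that a plain join like this does not reproduce the table in the question.

```r
# Rebuild the example inputs (same values as df1 and df2 above)
df1 <- data.frame(
  Name = c("Sarah", "David", "John", "Diana", "Judy"),
  City = c("ON", "BC", "KN", "AN", "ON"),
  stringsAsFactors = FALSE
)
df2 <- data.frame(
  Name = c("Dave", "Diana", "Judy", "Ari", "Shanyu"),
  City = c("ON", "AN", "ON", "KN", "MN"),
  ID = c(1092, 2314, 1290, 1450, 1983),
  stringsAsFactors = FALSE
)

# Left join: keep every row of df1, pull in ID where Name and City both match
merged <- merge(df1, df2, by = c("Name", "City"), all.x = TRUE)

# Assumption: ID is never NA in df2, so a non-missing ID means the row matched
merged$temp <- !is.na(merged$ID)

# Put the columns in the order the question asks for
merged <- merged[, c("Name", "City", "temp", "ID")]
merged
```

With this data only Diana and Judy get temp = TRUE. By default merge() sorts the result by the key columns, so the rows will not come back in the original df1 order.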

Answer 1: (score: 0)

Using biomiha's code to generate df1 and df2:

df1 <- data.frame(stringsAsFactors=FALSE,
                  Name = c("Sarah", "David", "John", "Diana", "Judy"),
                  City = c("ON", "BC", "KN", "AN", "ON")
)

df2 <- data.frame(stringsAsFactors=FALSE,
                  Name = c("Dave", "Diana", "Judy", "Ari", "Shanyu"),
                  City = c("ON", "AN", "ON", "KN", "MN"),
                  ID = c(1092, 2314, 1290, 1450, 1983)
)

You can try:

library(dplyr)

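# Keep every row of df1; ID is NA where no (Name, City) pair matches in df2
df1 %>%
  left_join(df2, by = c("Name", "City")) %>%
  # temp is TRUE exactly when the join found an ID
  mutate(temp = !is.na(ID))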

This gives the output:

   Name City   ID  temp
1 Sarah   ON   NA FALSE
2 David   BC   NA FALSE
3  John   KN   NA FALSE
4 Diana   AN 2314  TRUE
5  Judy   ON 1290  TRUE
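
If you would rather avoid the dplyr dependency, or you need the columns in the order Name, City, temp, ID with the rows kept in df1's original order, a base R sketch using match() on a combined key is one option. This is an alternative technique, not what the answer above does. It assumes the "|" separator never appears in Name or City, and that each (Name, City) pair occurs at most once in df2 (match() only returns the first hit). The names `key1`, `key2`, `idx` and `result` are illustrative.

```r
# Rebuild the example inputs (same values as df1 and df2 above)
df1 <- data.frame(
  Name = c("Sarah", "David", "John", "Diana", "Judy"),
  City = c("ON", "BC", "KN", "AN", "ON"),
  stringsAsFactors = FALSE
)
df2 <- data.frame(
  Name = c("Dave", "Diana", "Judy", "Ari", "Shanyu"),
  City = c("ON", "AN", "ON", "KN", "MN"),
  ID = c(1092, 2314, 1290, 1450, 1983),
  stringsAsFactors = FALSE
)

# Build one lookup key per row from the two matching columns
# (assumes "|" does not occur in Name or City)
key1 <- paste(df1$Name, df1$City, sep = "|")
key2 <- paste(df2$Name, df2$City, sep = "|")

# Position of each df1 row in df2, or NA when there is no match
idx <- match(key1, key2)

# Copy df1 and add the two new columns in the requested order
result <- df1
result$temp <- !is.na(idx)   # TRUE where a matching row exists in df2
result$ID <- df2$ID[idx]     # indexing with NA yields NA for unmatched rows
result
```

As with the join, only the Diana and Judy rows come back with temp = TRUE and an ID.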