Return row (i) if two columns match

Time: 2020-01-13 03:53:59

Tags: r

I have two datasets:

df1

ID        paddock    cow ID
90/123    10         09/123
90/124    11         09/124
90/125    11         09/124

df2

ID        paddock
09/123    20
09/124    21

I want to match df1$cowID against df2$ID and return df2$paddock for every row that matches. My current code is:

dt <- ifelse(df1$cowID %in% df2$ID, df2$paddock[i], NA)

But this returns an error. Can someone point me in the right direction? Thanks in advance!

4 Answers:

Answer 0: (score: 2)

You could consider joining the datasets.

# Named vector: the key column in df1 on the left, the key column in df2 on the right
dplyr::left_join(df1, df2, by = c('cow ID' = 'ID'))

Answer 1: (score: 1)

You should probably use match:

df1$df2_paddock <- df2$paddock[match(df1$cow_ID, df2$ID)]
df1

#      ID paddock cow_ID df2_paddock
#1 90/123      10 09/123          20
#2 90/124      11 09/124          21
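
To make the mechanics explicit, here is a minimal sketch that rebuilds the three-row df1 from the question (with the column named cow_ID, as in this answer; stringsAsFactors = FALSE is an assumption to keep the IDs as plain character vectors). match() returns, for each df1$cow_ID, the position of its first occurrence in df2$ID, or NA if there is none, and that index vector is then used to subset df2$paddock.

```r
# Data copied from the question; cow_ID is the assumed column name
df1 <- data.frame(
  ID      = c("90/123", "90/124", "90/125"),
  paddock = c(10, 11, 11),
  cow_ID  = c("09/123", "09/124", "09/124"),
  stringsAsFactors = FALSE
)
df2 <- data.frame(
  ID      = c("09/123", "09/124"),
  paddock = c(20, 21),
  stringsAsFactors = FALSE
)

# Position in df2$ID of each df1$cow_ID (NA where there is no match)
idx <- match(df1$cow_ID, df2$ID)

# Subset df2$paddock by those positions: one value per row of df1
df1$df2_paddock <- df2$paddock[idx]
df1
```

With this three-row df1, idx is 1 2 2, so the new column holds 20 21 21. A cow_ID that does not occur in df2$ID would get NA instead.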

Answer 2: (score: 1)

You can do this by merging the two data frames and selecting the column you need.

Using base R

df1 <-
  data.frame(
    ID = c("90/123", "90/124"),
    paddock = c(10, 11),
    cow_ID = c("09/123", "09/124")
  )

df2 <-
  data.frame(
    ID = c("90/123", "90/124"),
    paddock = c(20, 21)
  )

# Join the two data frames by ID, then select the column of interest
merge(df1, df2, by = c("ID"), suffixes = c(".x", ".y"))["paddock.y"]

# paddock.y
# 20
# 21
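
The example above joins on a column that has the same name in both data frames. In the question the key is cow_ID in df1 and ID in df2, so a possible variant for that case uses the by.x and by.y arguments of merge(). The data below are copied from the question (stringsAsFactors = FALSE is my addition), and all.x = TRUE is an assumption that df1 rows without a match should be kept with NA.

```r
# Data copied from the question; cow_ID is the assumed column name
df1 <- data.frame(
  ID      = c("90/123", "90/124", "90/125"),
  paddock = c(10, 11, 11),
  cow_ID  = c("09/123", "09/124", "09/124"),
  stringsAsFactors = FALSE
)
df2 <- data.frame(
  ID      = c("09/123", "09/124"),
  paddock = c(20, 21),
  stringsAsFactors = FALSE
)

# Join df1$cow_ID against df2$ID; all.x = TRUE keeps unmatched df1 rows
merged <- merge(df1, df2,
                by.x = "cow_ID", by.y = "ID",
                all.x = TRUE, suffixes = c(".x", ".y"))

# paddock.x comes from df1, paddock.y from df2
merged[c("cow_ID", "paddock.x", "paddock.y")]
```

Note that merge() sorts the result by the key column by default, so the row order may differ from df1. Pass sort = FALSE, or use the match() approach, if the original order matters.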

Using dplyr

library(dplyr)

df1 <-
  data.frame(
    ID = c("90/123", "90/124"),
    paddock = c(10, 11),
    cow_ID = c("09/123", "09/124")
  )

df2 <-
  data.frame(
    ID = c("90/123", "90/124"),
    paddock = c(20, 21)
  )

# Join the two data frames by ID, then select the column of interest
# (the dplyr argument is `suffix`, not `suffixes` as in merge())
df1 %>%
  inner_join(df2, by = c("ID"), suffix = c(".x", ".y")) %>%
  select(paddock.y) %>%
  rename(paddock = paddock.y)

# paddock
# 20
# 21

Answer 3: (score: 0)

If you want to use ifelse(), you could perhaps do it with the following code

with(df2,ifelse(ID %in% df1$cow_ID,paddock,NA))

which gives

> with(df2,ifelse(ID %in% df1$cow_ID,paddock,NA))
[1] 20 21

Data

df1 <- structure(list(ID = structure(1:3, .Label = c("90/123", "90/124", 
"90/125"), class = "factor"), paddock = c(10, 11, 11), cow_ID = structure(c(1L, 
2L, 2L), .Label = c("09/123", "09/124"), class = "factor")), class = "data.frame", row.names = c(NA, 
-3L))

df2 <- structure(list(ID = structure(1:2, .Label = c("09/123", "09/124"
), class = "factor"), paddock = c(20, 21)), class = "data.frame", row.names = c(NA, 
-2L))
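
One point worth checking with the ifelse() approach in this answer: because it is evaluated inside df2, the result has one element per row of df2, not per row of df1. The sketch below illustrates this, along with the NA branch. The ID "09/999" and its paddock value 22 are hypothetical, added only so that one df2 row has no matching cow.

```r
# cow_ID values as in the question
df1 <- data.frame(
  cow_ID = c("09/123", "09/124", "09/124"),
  stringsAsFactors = FALSE
)

# "09/999" / 22 is a hypothetical extra row with no matching cow
df2 <- data.frame(
  ID      = c("09/123", "09/124", "09/999"),
  paddock = c(20, 21, 22),
  stringsAsFactors = FALSE
)

# For each row of df2: its paddock if the ID occurs in df1$cow_ID, else NA
per_df2 <- with(df2, ifelse(ID %in% df1$cow_ID, paddock, NA))
per_df2

# The result lines up with df2, not with df1
length(per_df2) == nrow(df2)
```

Here per_df2 is 20 21 NA. If one value per row of df1 is needed instead, indexing with match(df1$cow_ID, df2$ID) is the more direct route.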