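Update column values based on another data frame in R

Time: 2018-02-15 00:12:03

Tags: r dataframe

I searched Stack Overflow and couldn't find what I was looking for, so apologies if this is a duplicate; I'd really appreciate a link to the original!

I have two data frames, CarDF and duplicateCarDF: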

ID <- c(1,2,3,4,5,6,7,8)
car <- c("acura", "audi", "benz", "benz", "bmw", "toyota", "toyota", "jeep")
year <- c(2001, 2002, 2004, 2016, 1999, 2017, 2017, 2005)

CarDF <- data.frame(ID, car, year)

ID2 <-c(4,7)
car2 <- c("benz2", "toyota2")
year2 <- c(2016, 2017)

duplicateCarDF <- data.frame(ID = ID2, car = car2, year = year2)

My goal is to update the car values in CarDF with the updated names from duplicateCarDF, matching on ID.

I tried the following...

CarDF$car <- ifelse(duplicateCarDF$ID %in% CarDF$ID, duplicateCarDF$car, CarDF$car )

But that changes the car names to benz2 and toyota2; I only want to update the cars with IDs 4 and 7.

Any help is greatly appreciated!

3 Answers:

Answer 0: (score: 4)

Using data.table...

library(data.table)
setDT(CarDF)

CarDF[duplicateCarDF, on=.(ID), car := i.car]

   ID     car year
1:  1   acura 2001
2:  2    audi 2002
3:  3    benz 2004
4:  4   benz2 2016
5:  5     bmw 1999
6:  6  toyota 2017
7:  7 toyota2 2017
8:  8    jeep 2005

This is sometimes called an "update join".
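
For reference, here is a self-contained sketch of the same update join. The data is recreated from the question directly as data.tables, and updating `year` alongside `car` is an addition for illustration only, not part of the answer above. `:=` modifies CarDF by reference, so no reassignment is needed, and the `i.` prefix refers to columns of the table passed in the `i` position (here duplicateCarDF).

```r
library(data.table)

# Example data from the question, built directly as data.tables
CarDF <- data.table(
  ID   = c(1, 2, 3, 4, 5, 6, 7, 8),
  car  = c("acura", "audi", "benz", "benz", "bmw", "toyota", "toyota", "jeep"),
  year = c(2001, 2002, 2004, 2016, 1999, 2017, 2017, 2005)
)
duplicateCarDF <- data.table(
  ID   = c(4, 7),
  car  = c("benz2", "toyota2"),
  year = c(2016, 2017)
)

# Update join: for every row of duplicateCarDF, find the CarDF rows with the
# same ID and overwrite car and year in place. Rows without a match are untouched.
CarDF[duplicateCarDF, on = .(ID), `:=`(car = i.car, year = i.year)]

print(CarDF)
```

Because the update happens in place, any other variable that points to the same data.table sees the change as well; use `copy(CarDF)` first if the original must be preserved.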

Answer 1: (score: 2)

Using dplyr verbs, we can left_join on ID and then conditionally replace car depending on whether the new value is missing:

library(dplyr)

CarDF %>%
  left_join(
    duplicateCarDF %>%           # note: the year column doesn't add any
      select(ID, new_car = car), # value here unless you have duplicated ID values
    by = "ID"
  ) %>%
  mutate(
    car = if_else(
      is.na(new_car),
      as.character(car),    # note: I'm coercing these to character because
      as.character(new_car) # we've joined two df with different levels
    )
  ) %>%
  select(-new_car)

#   ID     car year
# 1  1   acura 2001
# 2  2    audi 2002
# 3  3    benz 2004
# 4  4   benz2 2016
# 5  5     bmw 1999
# 6  6  toyota 2017
# 7  7 toyota2 2017
# 8  8    jeep 2005
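
A possible shorter variant of the same idea, offered as a sketch rather than as part of the original answer: `dplyr::coalesce()` returns the first non-missing value at each position, which is what the `if_else(is.na(...))` step does by hand. The data frames are rebuilt with `stringsAsFactors = FALSE` (an assumption made here) so that `car` is character in both and no factor coercion is needed.

```r
library(dplyr)

# Example data from the question, with car kept as character
CarDF <- data.frame(
  ID   = c(1, 2, 3, 4, 5, 6, 7, 8),
  car  = c("acura", "audi", "benz", "benz", "bmw", "toyota", "toyota", "jeep"),
  year = c(2001, 2002, 2004, 2016, 1999, 2017, 2017, 2005),
  stringsAsFactors = FALSE
)
duplicateCarDF <- data.frame(
  ID   = c(4, 7),
  car  = c("benz2", "toyota2"),
  year = c(2016, 2017),
  stringsAsFactors = FALSE
)

updated <- CarDF %>%
  # Bring in the replacement name as new_car; it is NA where no ID matches
  left_join(select(duplicateCarDF, ID, new_car = car), by = "ID") %>%
  # Take new_car when present, otherwise keep the existing car
  mutate(car = coalesce(new_car, car)) %>%
  select(-new_car)

print(updated)
```

Note that this returns a new data frame (`updated`, a name chosen for this sketch) and leaves CarDF unchanged.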

Answer 2: (score: 0)

A base R solution could use ifelse inside sapply over the IDs:

CarDF$car <- sapply(CarDF$ID, function(x) {
  ifelse(
    nrow(duplicateCarDF[duplicateCarDF$ID == x, ]) == 0, 
    as.character(CarDF[CarDF$ID == x, ]$car),
    as.character(duplicateCarDF[duplicateCarDF$ID == x, ]$car)
  )
})

# [1] "acura"   "audi"    "benz"    "benz2"   "bmw"     "toyota"  "toyota2" "jeep"
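
As an alternative sketch in base R (not from the original answer), `match()` can do the lookup in one vectorized step instead of subsetting once per ID. It also hints at why the attempt in the question misbehaves: the test `duplicateCarDF$ID %in% CarDF$ID` has only two elements, one per row of duplicateCarDF, so the two-element result is recycled across all eight rows. The helper names `idx` and `has_update` are chosen for this example, and `stringsAsFactors = FALSE` is assumed so that `car` is character.

```r
# Example data from the question, with car kept as character
CarDF <- data.frame(
  ID   = c(1, 2, 3, 4, 5, 6, 7, 8),
  car  = c("acura", "audi", "benz", "benz", "bmw", "toyota", "toyota", "jeep"),
  year = c(2001, 2002, 2004, 2016, 1999, 2017, 2017, 2005),
  stringsAsFactors = FALSE
)
duplicateCarDF <- data.frame(
  ID   = c(4, 7),
  car  = c("benz2", "toyota2"),
  year = c(2016, 2017),
  stringsAsFactors = FALSE
)

# For each CarDF row, the row position of its ID in duplicateCarDF (NA if absent)
idx <- match(CarDF$ID, duplicateCarDF$ID)
has_update <- !is.na(idx)

# Overwrite only the rows that have a replacement
CarDF$car[has_update] <- duplicateCarDF$car[idx[has_update]]

print(CarDF$car)
```

If duplicateCarDF contained the same ID more than once, `match()` would use the first occurrence only.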