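I searched Stack Overflow and couldn't find what I was looking for, so I apologize if this is a duplicate post; a link would be much appreciated!
I have two data frames: CarDF and duplicateCarDF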
ID <- c(1,2,3,4,5,6,7,8)
car <- c("acura", "audi", "benz", "benz", "bmw", "toyota", "toyota", "jeep")
year <- c(2001, 2002, 2004, 2016, 1999, 2017, 2017, 2005)
CarDF <- data.frame(ID, car, year)
ID2 <-c(4,7)
car2 <- c("benz2", "toyota2")
year2 <- c(2016, 2017)
duplicateCarDF <- data.frame(ID = ID2, car = car2, year = year2)
My goal is to update the car names in CarDF with the updated names from duplicateCarDF, matching on ID.
I tried the following...
CarDF$car <- ifelse(duplicateCarDF$ID %in% CarDF$ID, duplicateCarDF$car, CarDF$car )
But it changes all the car names to benz2 and toyota2. I only want to update the cars with ID 4 and 7.
Any help is greatly appreciated!
Answer 0 (score: 4)
Using data.table...
library(data.table)
setDT(CarDF)
CarDF[duplicateCarDF, on=.(ID), car := i.car]
   ID     car year
1:  1   acura 2001
2:  2    audi 2002
3:  3    benz 2004
4:  4   benz2 2016
5:  5     bmw 1999
6:  6  toyota 2017
7:  7 toyota2 2017
8:  8    jeep 2005
This is sometimes called an "update join".
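One thing to keep in mind is that := modifies the table by reference, so the original CarDF is changed in place. If you want to keep the original untouched, a minimal sketch along these lines might help. The names CarDT, DupDT and updated are made up for this illustration, and the data is rebuilt here so the snippet runs on its own:

```r
library(data.table)

# Hypothetical stand-ins for the question's data, built directly as data.tables
CarDT <- data.table(
  ID  = c(1, 2, 3, 4, 5, 6, 7, 8),
  car = c("acura", "audi", "benz", "benz", "bmw", "toyota", "toyota", "jeep")
)
DupDT <- data.table(ID = c(4, 7), car = c("benz2", "toyota2"))

# copy() makes a deep copy, so the update join below leaves CarDT unchanged
updated <- copy(CarDT)
updated[DupDT, on = "ID", car := i.car]  # i.car is the car column from DupDT

updated$car
CarDT$car
```

Here only updated gets the new names for ID 4 and 7, while CarDT keeps its original values.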
Answer 1 (score: 2)
Using dplyr verbs, we can left_join on ID, then conditionally replace car depending on whether the new value is missing.
library(dplyr)
CarDF %>%
  left_join(
    duplicateCarDF %>%            # note: the year column doesn't add any
      select(ID, new_car = car),  # value here unless you have duplicated ID values
    by = "ID"
  ) %>%
  mutate(
    car = if_else(
      is.na(new_car),
      as.character(car),      # note: I'm coercing these to character because
      as.character(new_car)   # we've joined two df with different levels
    )
  ) %>%
  select(-new_car)
#   ID     car year
# 1  1   acura 2001
# 2  2    audi 2002
# 3  3    benz 2004
# 4  4   benz2 2016
# 5  5     bmw 1999
# 6  6  toyota 2017
# 7  7 toyota2 2017
# 8  8    jeep 2005
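If you have dplyr 1.0.0 or later, rows_update() may be a shorter way to express the same join-and-replace idea. This is only a sketch under a few assumptions: the data is rebuilt here with character columns (stringsAsFactors = FALSE) so the snippet runs on its own, every ID in duplicateCarDF exists in CarDF, and the IDs in duplicateCarDF are unique. The name updated is made up for the example:

```r
library(dplyr)

# Rebuild the question's data with character (not factor) car names
CarDF <- data.frame(
  ID   = c(1, 2, 3, 4, 5, 6, 7, 8),
  car  = c("acura", "audi", "benz", "benz", "bmw", "toyota", "toyota", "jeep"),
  year = c(2001, 2002, 2004, 2016, 1999, 2017, 2017, 2005),
  stringsAsFactors = FALSE
)
duplicateCarDF <- data.frame(
  ID   = c(4, 7),
  car  = c("benz2", "toyota2"),
  year = c(2016, 2017),
  stringsAsFactors = FALSE
)

# Pass only the key and the column to overwrite; rows are matched on ID
updated <- rows_update(
  CarDF,
  duplicateCarDF %>% select(ID, car),
  by = "ID"
)

updated$car
```

Rows whose ID does not appear in duplicateCarDF are left as they are, so only ID 4 and 7 change.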
Answer 2 (score: 0)
A base R solution could use ifelse inside sapply, iterating over the IDs.
CarDF$car <- sapply(CarDF$ID, function(x) {
  ifelse(
    nrow(duplicateCarDF[duplicateCarDF$ID == x, ]) == 0,
    as.character(CarDF[CarDF$ID == x, ]$car),
    as.character(duplicateCarDF[duplicateCarDF$ID == x, ]$car)
  )
})
# [1] "acura" "audi" "benz" "benz2" "bmw" "toyota" "toyota2" "jeep"
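A vectorized base R alternative, not part of the answer above, would be to look up the row positions with match() and assign only to those rows. This is a rough sketch that assumes the car columns are character rather than factor (hence stringsAsFactors = FALSE); the helper names idx and found are made up for the example:

```r
# Rebuild the question's data with character (not factor) car names
CarDF <- data.frame(
  ID  = c(1, 2, 3, 4, 5, 6, 7, 8),
  car = c("acura", "audi", "benz", "benz", "bmw", "toyota", "toyota", "jeep"),
  stringsAsFactors = FALSE
)
duplicateCarDF <- data.frame(
  ID  = c(4, 7),
  car = c("benz2", "toyota2"),
  stringsAsFactors = FALSE
)

# For each ID in duplicateCarDF, find the row of CarDF with the same ID
idx <- match(duplicateCarDF$ID, CarDF$ID)

# Skip any IDs that have no match in CarDF (match() returns NA for those)
found <- !is.na(idx)

# Overwrite only the matched rows
CarDF$car[idx[found]] <- duplicateCarDF$car[found]

CarDF$car
```

This avoids the per-row subsetting inside sapply, which may matter on larger data frames.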