I have two data frames:
state = c("CA","WA","OR","AZ")
first = c("Jim","Mick","Paul","Ron")
df1 <- data.frame(first, state)
df1
first state
1 Jim CA
2 Mick WA
3 Paul OR
4 Ron AZ
newstate = c("TX", "LA")
first =c("Jim","Mick")
df2 <- data.frame(first,newstate)
df2
first newstate
1 Jim TX
2 Mick LA
I am using the lookup function from qdapTools:
library(qdapTools)
df1$match <- lookup(df1$first, df2[, 1:2])
> df1
first state match
1 Jim CA TX
2 Mick WA LA
3 Paul OR <NA>
4 Ron AZ <NA>
Is there a way to ignore non-matches, or to have a non-match return the existing variable as-is? This is an example of the desired result:
first state match
1 Jim CA TX
2 Mick WA LA
3 Paul OR OR
4 Ron AZ AZ
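Answer 0 (score: 1)
With the dplyr package, you can use coalesce() to get the job done. The first step is to join the two data sets and convert the factors to character (convert them back to factors afterwards if you need to). Then you use coalesce() to fill the NAs in newstate with the corresponding values from state.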
library(dplyr)
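# Join on the shared key, then convert every column to character
left_join(df1, df2, by = "first") %>%
  mutate_all(as.character) %>%   # funs() is deprecated in recent dplyr versions
  # Keep newstate where present, otherwise fall back to state
  mutate(newstate = coalesce(newstate, state))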
# first state newstate
#1 Jim CA TX
#2 Mick WA LA
#3 Paul OR OR
#4 Ron AZ AZ
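To illustrate why the conversion to character matters, here is a minimal base R sketch. It assumes the columns are factors, which is what data.frame() produced by default before R 4.0.0. The variable names filled_factor, filled_chr, and is_missing are illustrative only.

```r
# Factor columns, as data.frame() created them by default before R 4.0.0.
state    <- factor(c("CA", "WA", "OR", "AZ"))
newstate <- factor(c("TX", "LA", NA, NA))
is_missing <- is.na(newstate)

# Filling a factor with values that are not among its levels ("OR", "AZ")
# produces NA again, together with an "invalid factor level" warning.
filled_factor <- newstate
suppressWarnings(filled_factor[is_missing] <- state[is_missing])

# Working on character vectors avoids the problem.
filled_chr <- as.character(newstate)
filled_chr[is_missing] <- as.character(state)[is_missing]

print(filled_factor)
print(filled_chr)
```

With character vectors the fallback values are copied as-is, which is presumably why the answer converts before calling coalesce().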
Answer 1 (score: 0)
A naive approach is to replace the NAs with the state values:
> state = c("CA","WA","OR","AZ")
> first = c("Jim","Mick","Paul","Ron")
> df1 <- data.frame(first, state, stringsAsFactors = F)
> df1
first state
1 Jim CA
2 Mick WA
3 Paul OR
4 Ron AZ
>
> newstate = c("TX", "LA")
> first =c("Jim","Mick")
> df2 <- data.frame(first,newstate, stringsAsFactors = F)
> df2
first newstate
1 Jim TX
2 Mick LA
>
> df3 <- merge(df1,df2, by='first', all=TRUE)
>
> #df3 <- as.character(df3)
>
> df3$newstate[is.na(df3$newstate)] <- df3$state[is.na(df3$newstate)]
>
> df3
first state newstate
1 Jim CA TX
2 Mick WA LA
3 Paul OR OR
4 Ron AZ AZ
>
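Note that merge() sorts the result by the key column by default, which happens to match the original order in this example. If the row order of df1 must be preserved in general, one possible variant is to skip the merge and use match() directly. This is only a sketch under the assumption that the columns are character vectors; lookup_or_keep() is a hypothetical helper name, not a function from qdapTools or any other package.

```r
# Same sample data as in the question, kept as character vectors.
df1 <- data.frame(first = c("Jim", "Mick", "Paul", "Ron"),
                  state = c("CA", "WA", "OR", "AZ"),
                  stringsAsFactors = FALSE)
df2 <- data.frame(first = c("Jim", "Mick"),
                  newstate = c("TX", "LA"),
                  stringsAsFactors = FALSE)

# Hypothetical helper: look up each key, keep the fallback where nothing matches.
lookup_or_keep <- function(keys, fallback, key_match, key_value) {
  idx <- match(keys, key_match)   # NA where a key has no match
  found <- key_value[idx]         # looked-up values, NA for non-matches
  ifelse(is.na(idx), fallback, found)
}

df1$match <- lookup_or_keep(df1$first, df1$state, df2$first, df2$newstate)
print(df1)
```

Because nothing is merged, df1 keeps its original rows and order, and only the new match column is added.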
Answer 2 (score: 0)
You can also use dplyr:
library(tidyverse)
df3 <- merge(x = df1, y = df2, all.x = TRUE) %>%
  mutate(state = as.character(state),
         newstate = as.character(newstate),
         newstate = if_else(is.na(newstate), state, newstate))