I am trying to replace some elements of a data frame when they match elements of another data frame.
DF1:
V1 V2 V3
10 JP_00267-008 JP_00267-008 Line
11 JP_00302-049 JP_00302-049 Line
12 4FP3188 4FP3188 Line
13 JP_00284-029 JP_00284-029 Line
14 JP_00268-005 JP_00268-005 Line
15 JP_00265-057 JP_00265-057 Line
16 JP_00286-010 JP_00286-010 Line
17 JP_00283-008 JP_00283-008 Line
18 JP_00330-298 JP_00330-298 Line
19 JP_00269-035 JP_00269-035 Line
20 JP_00300-106 JP_00300-106 Line
DF2:
V1 V2
10 JP_00267-008 4FP3428
11 JP_00302-049 4FP5103
13 JP_00284-029 4FP4137
14 JP_00268-005 4FP3465
15 JP_00265-057 4FP3367
16 JP_00286-010 4FP4245
17 JP_00283-008 4FP4085
18 JP_00330-298 4PP3992
19 JP_00269-035 4FP3575
20 JP_00300-106 4FP4963
The output I want is:
V1 V2 V3
10 4FP3428 JP_00267-008 Line
11 4FP5103 JP_00302-049 Line
12 4FP3188 4FP3188 Line
13 4FP4137 JP_00284-029 Line
14 4FP3465 JP_00268-005 Line
15 4FP3367 JP_00265-057 Line
16 4FP4245 JP_00286-010 Line
17 4FP4085 JP_00283-008 Line
18 4PP3992 JP_00330-298 Line
19 4FP3575 JP_00269-035 Line
20 4FP4963 JP_00300-106 Line
But what I get is:
V1 V2 V3
10 4FP3428 JP_00267-008 Line
11 4FP5103 JP_00302-049 Line
12 <NA> 4FP3188 Line
13 4FP4137 JP_00284-029 Line
14 4FP3465 JP_00268-005 Line
15 4FP3367 JP_00265-057 Line
16 4FP4245 JP_00286-010 Line
17 4FP4085 JP_00283-008 Line
18 4PP3992 JP_00330-298 Line
19 4FP3575 JP_00269-035 Line
20 4FP4963 JP_00300-106 Line
This is the code I am using:
df1[,1] <- df2[match(as.character(unlist(df1[,1])), as.character(df2[[1]])), 2]
Can anyone help me avoid the NA and keep the original element where there is no match in DF2?
Thanks in advance.
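Answer 0: (score: 3)
If you want to stick with base R, keep the index returned by match() and assign only to the rows where it is not NA: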
# an index which includes missing values
idx <- match(as.character(unlist(df1[,1])), as.character(df2[[1]]))
# an index of the non-missing values in `idx`
idx_not_missing <- !is.na(idx)
# push the data only when the index `idx` is not missing
df1[idx_not_missing,1] <- df2[idx[idx_not_missing], 2]
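To make the effect easier to check, here is a minimal self-contained sketch of the same idea on a three-row subset of the question's data. The small `df1` and `df2` below are rebuilt by hand for illustration and assume character columns (`stringsAsFactors = FALSE`); with factor columns the assignment would need `as.character()` first, as in the question's code.

```r
# Illustrative subset of the question's data (assumed to be character columns)
df1 <- data.frame(
  V1 = c("JP_00267-008", "4FP3188", "JP_00284-029"),
  V2 = c("JP_00267-008", "4FP3188", "JP_00284-029"),
  V3 = "Line",
  stringsAsFactors = FALSE
)
df2 <- data.frame(
  V1 = c("JP_00267-008", "JP_00284-029"),
  V2 = c("4FP3428", "4FP4137"),
  stringsAsFactors = FALSE
)

# Position of each df1$V1 value in df2$V1; NA where there is no match
idx <- match(df1$V1, df2$V1)

# Overwrite only the matched rows, so unmatched rows keep their original value
found <- !is.na(idx)
df1$V1[found] <- df2$V2[idx[found]]

print(df1)
```

The row holding `4FP3188` has no partner in `df2`, so it is skipped by the assignment instead of being overwritten with NA.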
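Answer 1: (score: 1)
Here is an option using data.table. Note that setkey sorts df1 by V1, so the row order differs from the original: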
library(data.table)
setkey(setDT(df1), V1)[df2, V1:=i.V2][]
# V1 V2 V3
# 1: 4FP3188 4FP3188 Line
#2: 4FP3367 JP_00265-057 Line
#3: 4FP3428 JP_00267-008 Line
#4: 4FP3465 JP_00268-005 Line
#5: 4FP3575 JP_00269-035 Line
#6: 4FP4085 JP_00283-008 Line
#7: 4FP4137 JP_00284-029 Line
#8: 4FP4245 JP_00286-010 Line
#9: 4FP4963 JP_00300-106 Line
#10: 4FP5103 JP_00302-049 Line
#11: 4PP3992 JP_00330-298 Line
Or using dplyr (the columns end up in a different order):
library(dplyr)
left_join(df1, df2, by = 'V1') %>%
  mutate(V2.y = ifelse(is.na(V2.y), V1, V2.y)) %>%
  select(-V1) %>%
  rename(V1 = V2.y, V2 = V2.x)
# V2 V3 V1
#1 JP_00267-008 Line 4FP3428
#2 JP_00302-049 Line 4FP5103
#3 4FP3188 Line 4FP3188
#4 JP_00284-029 Line 4FP4137
#5 JP_00268-005 Line 4FP3465
#6 JP_00265-057 Line 4FP3367
#7 JP_00286-010 Line 4FP4245
#8 JP_00283-008 Line 4FP4085
#9 JP_00330-298 Line 4PP3992
#10 JP_00269-035 Line 4FP3575
#11 JP_00300-106 Line 4FP4963
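If keeping the original column order matters, a possible variant is to rename the lookup column before joining and then fill with `coalesce()`. This is only a sketch, not the answerer's code: the column name `new_V1` is made up for illustration, the data is a hand-built subset of the question's, and the columns are assumed to be character rather than factor.

```r
library(dplyr)

# Illustrative subset of the question's data (assumed to be character columns)
df1 <- data.frame(
  V1 = c("JP_00267-008", "4FP3188", "JP_00284-029"),
  V2 = c("JP_00267-008", "4FP3188", "JP_00284-029"),
  V3 = "Line",
  stringsAsFactors = FALSE
)
df2 <- data.frame(
  V1 = c("JP_00267-008", "JP_00284-029"),
  V2 = c("4FP3428", "4FP4137"),
  stringsAsFactors = FALSE
)

result <- df1 %>%
  # new_V1 is a hypothetical name chosen to avoid the V2.x / V2.y suffixes
  left_join(rename(df2, new_V1 = V2), by = "V1") %>%
  # take the replacement where one exists, otherwise keep the original V1
  mutate(V1 = coalesce(new_V1, V1)) %>%
  select(-new_V1)

print(result)
```

Because the replacement is written back into `V1` itself, the result keeps the V1, V2, V3 order and no final rename is needed.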