Suppose there are two data frames, A and B, containing the following data:

Data frame A:
ColA1 | ColA2
Dog   | Lion
Lion  | Dog
Zebra | Tiger
Bat   | Parrot

Data frame B:
ColB1 | ColB2
Lion  | Lion
Cat   | NA
Tiger | Tiger
Dog   | Dog
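If an animal in ColB1 is present in ColA1 or ColA2, insert that animal's name into ColB2; otherwise insert NA.
Instead of running the ifelse function twice:
B$ColB2 <- ifelse((B$ColB1 %in% A$ColA1 | B$ColB1 %in% A$ColA2), "animal from ColA1", NA)
How can this be shortened? Would it become faster by using the apply function?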
Answer 0 (score: 2)
Using a logical index is also an option:
i <- dfB$ColB1 %in% unlist(dfA)
# Copy the matching names from ColB1; rows without a match keep their existing value
dfB$ColB2[i] <- as.character(dfB$ColB1[i])
Result:
> dfB
ColB1 ColB2
1 Lion Lion
2 Cat NA
3 Tiger Tiger
4 Dog Dog
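
As a self-contained sketch of the indexing idea, the snippet below retypes the question's data with character columns (an assumption; the dput output further down uses factors) and assumes ColB2 starts out as all NA, so that only the matching rows get filled:

```r
# Data retyped from the question's tables, as character columns (assumption)
dfA <- data.frame(ColA1 = c("Dog", "Lion", "Zebra", "Bat"),
                  ColA2 = c("Lion", "Dog", "Tiger", "Parrot"),
                  stringsAsFactors = FALSE)
dfB <- data.frame(ColB1 = c("Lion", "Cat", "Tiger", "Dog"),
                  stringsAsFactors = FALSE)

# Start from an all-NA column, then fill only the rows whose animal
# occurs anywhere in dfA (unlist() flattens both columns into one vector)
dfB$ColB2 <- NA_character_
i <- dfB$ColB1 %in% unlist(dfA)
dfB$ColB2[i] <- dfB$ColB1[i]
```

Initialising the column first makes the NA case explicit instead of relying on whatever ColB2 held before.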
Answer 1 (score: 1)
You can try using dplyr:
library(dplyr)
dfB %>%
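  # as.character() and NA_character_ keep both branches the same type, as if_else() requires
  mutate(colB3 = if_else(ColB1 %in% unlist(dfA), as.character(ColB1), NA_character_))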
This gives:
ColB1 ColB2 colB3
1 Lion Lion Lion
2 Cat NA NA
3 Tiger Tiger Tiger
4 Dog Dog Dog
Input:
dput(dfA)
structure(list(ColA1 = structure(c(2L, 3L, 4L, 1L), .Label = c("Bat",
"Dog", "Lion", "Zebra"), class = "factor"), ColA2 = structure(c(2L,
1L, 4L, 3L), .Label = c("Dog", "Lion", "Parrot", "Tiger"), class = "factor")), class = "data.frame", row.names = c(NA,
-4L), .Names = c("ColA1", "ColA2"))
dput(dfB)
structure(list(ColB1 = structure(c(3L, 1L, 4L, 2L), .Label = c("Cat",
"Dog", "Lion", "Tiger"), class = "factor"), ColB2 = structure(c(2L,
3L, 4L, 1L), .Label = c("Dog", "Lion", "NA", "Tiger"), class = "factor")), class = "data.frame", row.names = c(NA,
-4L), .Names = c("ColB1", "ColB2"))
Answer 2 (score: 1)
This is probably the simplest:
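# as.character() guards against ifelse() returning integer codes when ColB1 is a factor
df_B$ColB2 <- ifelse(df_B$ColB1 %in% unlist(df_A[, 1:2]), as.character(df_B$ColB1), NA)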
Output:
ColB1 ColB2
1 Lion Lion
2 Cat <NA>
3 Tiger Tiger
4 Dog Dog
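
On the question of whether apply would be faster: %in% is already vectorized (it is built on match), so an apply-style loop is unlikely to beat it. The sketch below only checks that an element-by-element loop yields the same membership result; the vectors are retyped from the question and the names A_values, vectorized and looped are illustrative:

```r
# All values from both columns of A, and the lookup column of B (retyped from the question)
A_values <- c("Dog", "Lion", "Zebra", "Bat", "Lion", "Dog", "Tiger", "Parrot")
ColB1 <- c("Lion", "Cat", "Tiger", "Dog")

# One vectorized call over the whole column
vectorized <- ColB1 %in% A_values

# The same test written as an explicit per-element loop
looped <- sapply(ColB1, function(b) any(A_values == b), USE.NAMES = FALSE)

same_result <- identical(vectorized, looped)
```

Since both produce the same logical vector, the single %in% call is the shorter option and there is little reason to expect the loop to run faster.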
To find, for each column of df_A, the individual indices that match the values in df_B$ColB1, you can use the following:
x <- apply(df_A[, 1:2], 2, function(x) sapply(df_B$ColB1, function(i) grep(i, x)))
Output of str(x):
List of 2
$ ColA1:List of 4
..$ Lion : int 2
..$ Cat : int(0)
..$ Tiger: int(0)
..$ Dog : int 1
$ ColA2:List of 4
..$ Lion : int 1
..$ Cat : int(0)
..$ Tiger: int 3
..$ Dog : int 2
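
Note that grep treats each value as a regular expression and matches substrings, so "Cat" would also hit a value such as "Caterpillar". If whole-value matches are wanted, a variant using which() might look like the sketch below; the data is retyped from the question as character columns, and idx is an illustrative name:

```r
# Data retyped from the question, as character columns (assumption)
df_A <- data.frame(ColA1 = c("Dog", "Lion", "Zebra", "Bat"),
                   ColA2 = c("Lion", "Dog", "Tiger", "Parrot"),
                   stringsAsFactors = FALSE)
ColB1 <- c("Lion", "Cat", "Tiger", "Dog")

# For each column of df_A, list the row positions equal to each ColB1 value;
# setNames() labels the inner list by animal, and integer(0) means no match
idx <- lapply(df_A, function(col) {
  lapply(setNames(ColB1, ColB1), function(b) which(col == b))
})
str(idx)
```

For this data the structure mirrors the grep-based result above, because none of the animal names is a substring of another.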