How to fix the return of matched elements in R

Time: 2019-01-22 13:27:29

Tags: r match

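I have two data frames that I want to match against each other and return the matching element. The problem is that when there is a match, I get back the first element of the lookup column rather than the element that actually matched.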

I tried this code, but it returns the index of the matched element instead of the element itself:

data3 <- data1 %>%
   mutate(score = lapply(M,function(x){ifelse (f<-(match(x, data2$words)), f )}))

  Term score
1    I    NA
2  A B  3, 7
3    A     3
4    Z    NA
5    D     4
6    B     7 

Here is the code:

library(dplyr)    # provides %>% and mutate()
library(stringr)  # provides str_split()
data1 <- data.frame(Term = c("I", "A B", "A", "Z", "D", "B")) # txt
M <- str_split(data1$Term, pattern = "\\s+")
data2 <- data.frame(words = c("O", "C", "A", "D", "E", "F", "B")) # dec
data3 <- data1 %>% mutate(score = lapply(M, function(x) {ifelse(match(x, data2$words), data2$words)}))

#here is the result I got 

 Term score
1    I    NA
2  A B  O, C
3    A     O
4    Z    NA
5    D     O
6    B     O

#and as you can see if there is a match it returns the first element.

#the result I expected 

  Term score
1    I    NA
2  A B  A, B
3    A     A
4    Z    NA
5    D     D
6    B     B

1 answer:

Answer 0: (score: 1)

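After getting the index with match, use it to extract the corresponding words from data2, paste the non-NA elements into a single string (toString), and change the blank ("") results to NA with dplyr::na_if: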

data1$score <- sapply(M, function(x) {
       x1 <- data2$words[match(x, data2$words)]
       dplyr::na_if(toString(na.omit(x1)), "")})
data1$score
#[1] NA     "A, B" "A"    NA     "D"    "B"
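
To make the two steps in this answer easier to follow, here is a minimal base R sketch that prints the intermediate values for the term "A B" and shows where the "O, C" in the question's result comes from. It assumes the same data as the question; `strsplit` stands in for `stringr::str_split`, and `match_words` is a hypothetical helper name introduced only for this illustration, not part of the original answer.

```r
# Same data as in the question; stringsAsFactors = FALSE keeps the columns as
# character vectors on R versions before 4.0 as well.
data1 <- data.frame(Term = c("I", "A B", "A", "Z", "D", "B"), stringsAsFactors = FALSE)
data2 <- data.frame(words = c("O", "C", "A", "D", "E", "F", "B"), stringsAsFactors = FALSE)

# Base R stand-in for stringr::str_split(): one character vector per term.
M <- strsplit(data1$Term, "\\s+")

# Step 1: match() returns positions in data2$words (NA where nothing matches).
idx <- match(M[[2]], data2$words)   # M[[2]] is c("A", "B")
idx
# [1] 3 7

# Step 2: indexing with those positions returns the matched words themselves.
data2$words[idx]
# [1] "A" "B"

# ifelse(test, yes) takes yes[i] wherever test[i] is TRUE, so it reads
# data2$words from its first element instead of using the matched positions.
rep(data2$words, length.out = length(idx))
# [1] "O" "C"

# Hypothetical helper (not from the original answer) that wraps both steps
# and returns NA when none of the pieces of a term are found.
match_words <- function(x, dictionary) {
  hits <- dictionary[match(x, dictionary)]
  hits <- hits[!is.na(hits)]
  if (length(hits) == 0) NA_character_ else paste(hits, collapse = ", ")
}

score <- vapply(M, match_words, character(1), dictionary = data2$words)
score
# [1] NA     "A, B" "A"    NA     "D"    "B"
```

Using `vapply` with `character(1)` rather than `sapply` is a design choice in this sketch: it guarantees a plain character vector even if every term fails to match.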