Question

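I have two long vectors of names (list.1, list.2). I want to run a loop that checks whether any name in list.2 matches any name in list.1. If it does, I want to append the position of the matching name in list.1 to a vector, result.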

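result <- integer(0)
for (i in seq_along(list.2)) {
  pos <- 0  # stays 0 if list.2[i] matches nothing in list.1
  for (j in seq_along(list.1)) {
    # grep() returns a zero-length vector when there is no match
    if (length(grep(list.2[i], list.1[j], ignore.case = TRUE)) > 0) {
      pos <- j
      break
    }
  }
  result <- append(result, pos)  # append() returns a new vector, so assign it back
}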

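The code above is very heavy: my name vectors have lengths 5,000 and 60,000, so it could run on the order of 300,000,000 iterations (5,000 × 60,000). How can I improve it?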

Answer 1

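which and %in% would probably be good for this task, or match, depending on what you are going for. One point to note is that match returns the index of the first match of its first argument in its second argument (that is, if a value occurs more than once in the lookup table, only the position of its first occurrence is returned):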

set.seed(123)
#  I am assuming these are the values you want to check if they are in the lookup 'table'
#  (the draw shown is from the original post; R >= 3.6.0 uses a different default
#  sampler, so the same seed may give different letters)
list2 <- sample( letters[1:10] , 10 , replace = TRUE )
[1] "c" "h" "e" "i" "j" "a" "f" "i" "f" "e"

#  I am assuming this is the lookup table
list1 <- letters[1:3]
[1] "a" "b" "c"

#  Find which position in the lookup table each value is, NA if no match
match(list2 , list1 )
[1]  3 NA NA NA NA  1 NA NA NA NA
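
The which and %in% variant mentioned at the top of this answer could look roughly like the sketch below. It hard-codes the list1 and list2 values shown above so that it does not depend on the random draw. The tolower() calls are an assumption, added to mimic the ignore.case=TRUE in the question; they give exact, case-insensitive matching rather than the partial (regex) matching that grep performs.

```r
# Fixed copies of the example values, so the result does not depend on the RNG
list2 <- c("c", "h", "e", "i", "j", "a", "f", "i", "f", "e")
list1 <- c("a", "b", "c")

# Positions in list1 of the names that occur anywhere in list2
hits_in_list1 <- which(list1 %in% list2)

# Positions in list2 of the names that occur anywhere in list1
hits_in_list2 <- which(list2 %in% list1)

# Case-insensitive exact matching
# (assumption: lower-casing both sides is acceptable for your names)
pos <- match(tolower(c("C", "H", "A")), tolower(list1))

print(hits_in_list1)
print(hits_in_list2)
print(pos)
```

Both %in% and match are vectorised and work on the whole vectors at once, which is why they avoid the nested loop from the question.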

Answer 2

This is exactly what the set operations intersect/union/setdiff() are for:

list.1 = c('Alan','Bill','Ted','Alice','Carol')
list.2 = c('Carol','Ted')
intersect(list.1, list.2)
[1] "Ted"   "Carol"

...or if you really want the indices into list.1:

match(intersect(list.1, list.2), list.1)
[1] 3 5
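
For completeness, here is a small sketch of the other two set operations named above, on the same example vectors. The variable names only_in_1, all_names and idx are made up for illustration.

```r
list.1 <- c("Alan", "Bill", "Ted", "Alice", "Carol")
list.2 <- c("Carol", "Ted")

# Names in list.1 that do not appear in list.2
only_in_1 <- setdiff(list.1, list.2)

# Every distinct name from either vector
all_names <- union(list.1, list.2)

# Indices into list.1 without the intermediate intersect();
# this agrees with match(intersect(list.1, list.2), list.1) here
# because list.1 contains no duplicated names
idx <- which(list.1 %in% list.2)

print(only_in_1)
print(all_names)
print(idx)
```

Note that these functions compare whole strings exactly, so unlike grep with ignore.case=TRUE they are case-sensitive and do not match substrings.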
