Question

I have two vectors:

 a <- letters[1:5]
 b <- c('a','k','w','p','b','b')

Now I want to count how many times each letter of vector a appears in b. I want to get:

 # 1  2  0  0  0

How can I do this?

Answer 1

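tabulate works on integer vectors and is fast; match your letters against the universe of possible letters, then tabulate the resulting indices; pass length(a) so that every possible value gets a count, including those that never occur.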

> tabulate(match(b, a), length(a))
 [1] 1 2 0 0 0

This is faster than the 'obvious' table() solution:

library(microbenchmark)
f0 = function() table(factor(b,levels=a))
f1 = function() tabulate(match(b, a), length(a))

and then

> microbenchmark(f0(), f1())
Unit: microseconds
 expr     min       lq  median       uq     max neval
 f0() 566.824 576.2985 582.950 594.4200 798.275   100
 f1()  56.816  60.0180  63.305  65.4185 120.441   100

It is also more general: for example, it can match numeric values without coercing them to their string representation.
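As a minimal sketch of the numeric case, here is the same match/tabulate idea applied step by step. The vectors a_num and b_num are hypothetical data made up for illustration; they are not from the question.

```r
# Hypothetical numeric data (an assumption for illustration only)
a_num <- c(1.5, 2, 10)
b_num <- c(2, 10, 10, 7, 1.5, 10)

# match() returns, for each element of b_num, its position in a_num
# (NA when the value does not occur in a_num, here the 7)
idx <- match(b_num, a_num)
idx
# [1]  2  3  3 NA  1  3

# tabulate() counts how often each position 1..length(a_num) occurs;
# NA entries are ignored
counts <- tabulate(idx, nbins = length(a_num))
counts
# [1] 1 1 3
```

No factor or character conversion is involved: the values are compared as numbers, and the result is a plain integer vector in the order of a_num.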

Answer 2

Turn b into a factor with the levels specified by a. Values not in a become <NA>. When you tabulate, they are dropped (unless you specify useNA="ifany").

table(factor(b,levels=a))

a b c d e 
1 2 0 0 0
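To illustrate the useNA remark, the sketch below tabulates the same factor both ways. It uses the a and b from the question; the variable names f, dropped and with_na are introduced here for illustration.

```r
a <- letters[1:5]
b <- c('a', 'k', 'w', 'p', 'b', 'b')

# 'k', 'w' and 'p' are not levels of the factor, so they become NA
f <- factor(b, levels = a)

# Default: NA values are silently dropped from the counts
dropped <- table(f)

# useNA = "ifany" adds an extra <NA> cell when NA values are present
with_na <- table(f, useNA = "ifany")
with_na
# f
#    a    b    c    d    e <NA>
#    1    2    0    0    0    3
```

The extra cell is a quick check on how many elements of b fell outside a.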

Answer 3

> sapply(a, function(x) sum(x == b))

a b c d e 
1 2 0 0 0

An alternative solution. The anonymous function can be modified to do fuzzy name matching with packages such as stringdist.
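A rough sketch of what such a fuzzy variant might look like follows. To stay dependency-free it uses utils::adist() from base R (Levenshtein edit distance) rather than stringdist; a distance function from stringdist could presumably be swapped in the same way. The name vectors and the max_dist threshold are hypothetical, chosen only for illustration.

```r
# Hypothetical data: reference names and observed names with typos
a_names <- c("apple", "banana")
b_names <- c("apple", "aple", "banana", "bananna", "cherry")

# Assumed tolerance: count an observed name as a hit when it is
# within one edit (insertion, deletion or substitution) of the reference
max_dist <- 1

# Same shape as the exact-match version, with x == b replaced
# by a distance comparison
fuzzy_counts <- sapply(a_names, function(x) {
  sum(utils::adist(x, b_names) <= max_dist)
})
fuzzy_counts
#  apple banana
#      2      2
```

With max_dist set to 0 this reduces to exact matching. Note that the single letters in the question are all within one substitution of each other, so a tolerance of 1 would count every element there; fuzzy matching only makes sense for longer strings.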
