Count occurrences of a character and exclude missing values

Date: 2016-03-03 16:22:25

Tags: regex r stringr

I have a dataset like this:

aa<-structure(c("AABB", "AABB", NA, "AABB", "AABB", "AABB", "AABB", 
            "AABB", "AABB", "AABB", "AAAA", "AAAA", NA, "AAAA", "AAAA", "AAAA", 
            "AAAA", "AAAA", "AAAA", "AAAA", "BBBB", NA, NA, NA, "AAAA", "AAAA", 
            NA, NA, NA, NA, "AAAA", NA, NA, NA, "AAAA", "BBBB", NA, NA, NA, 
            NA, "AABB", NA, NA, NA, "AABB", "AAAA", NA, NA, NA, NA, "AAAA", 
            "AAAA", "AAAA", "BBBB", "AAAA", "BBBB", "BBBB", "BBBB", "BBBB", 
            "BBBB", "AABB", "AABB", "AABB", "AAAA", "AABB", "AAAA", "AABB", 
            "AAAA", "AAAA", "AAAB", "BBBB", "BBBB", NA, "AABB", "AABB", "AABB", 
            "AABB", "AABB", "AABB", "AABB", "AAAA", "AAAA", NA, "AAAA", "AAAA", 
            "AAAA", "AAAA", "AAAA", "AAAA", "AAAA", "BBBB", "BBBB", NA, "BBBB", 
            "BBBB", "AAAA", "AAAA", "BBBB", "BBBB", "ABBB"), .Dim = c(10L, 10L))

I am trying to count the number of "A" characters in each element. I tried two methods.

1. str_count from the stringr package

> apply(aa,2,str_count,"A")
      [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
 [1,]    2    4    0    4    2    4    2    0    4     0
 [2,]    2    4    1    1    1    4    2    0    4     0
 [3,]    1    1    1    1    1    4    2    1    1     1
 [4,]    2    4    1    1    1    0    4    2    4     0
 [5,]    2    4    4    4    2    4    2    2    4     0
 [6,]    2    4    4    0    4    0    4    2    4     4
 [7,]    2    4    1    1    1    0    2    2    4     4
 [8,]    2    4    1    1    1    0    4    2    4     0
 [9,]    2    4    1    1    1    0    4    2    4     0
[10,]    2    4    1    1    1    0    3    2    4     1

I get 1 for the missing values, but I would like NA.

2. regex

dosage<-function(string,char){

  x<-sapply(regmatches(string, gregexpr(char, string)), length)
  return(x)
}

apply(aa,2,dosage,"A")

      [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
 [1,]    2    4    0    4    2    4    2    0    4     0
 [2,]    2    4    0    0    0    4    2    0    4     0
 [3,]    0    0    0    0    0    4    2    0    0     0
 [4,]    2    4    0    0    0    0    4    2    4     0
 [5,]    2    4    4    4    2    4    2    2    4     0
 [6,]    2    4    4    0    4    0    4    2    4     4
 [7,]    2    4    0    0    0    0    2    2    4     4
 [8,]    2    4    0    0    0    0    4    2    4     0
 [9,]    2    4    0    0    0    0    4    2    4     0
[10,]    2    4    0    0    0    0    3    2    4     1

I get 0 for the missing values, but again I want NA.

How can I do this?

1 answer:

Answer 0 (score: 1)

You can apply str_count directly to the whole object and format the result as a matrix.
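The idea of counting over the whole object and then restoring the matrix shape can be sketched in base R. This is only an illustration, not the answer's original code: count_char is a hypothetical helper name, and it handles NA explicitly so the result does not depend on how a particular stringr version treats missing values. As in the question's gregexpr call, char is interpreted as a regular expression.

```r
# Hypothetical helper (not from the original answer): count matches of
# `char` in every element of `x`, returning NA wherever `x` is NA.
count_char <- function(x, char) {
  n <- rep(NA_integer_, length(x))   # start with NA everywhere
  ok <- !is.na(x)                    # positions that hold real strings
  # count matches only for the non-missing elements
  n[ok] <- lengths(regmatches(x[ok], gregexpr(char, x[ok])))
  dim(n) <- dim(x)                   # restore the matrix shape, if any
  n
}

# Small example matrix (filled column by column)
m <- matrix(c("AABB", NA, "AAAA", "BBBB"), nrow = 2)
res <- count_char(m, "A")
print(res)
#      [,1] [,2]
# [1,]    2    4
# [2,]   NA    0
```

Applied to the question's data, count_char(aa, "A") should give a 10 x 10 integer matrix with NA in the cells where aa is NA.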