Question

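I know this is a very naive question, but I have tried a lot and could not find a way to count the occurrences of a specified substring within a string in R.

For example: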

str <- "Hello this is devavrata! here, say again hello"

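Now I want to find the number of occurrences of hello, ignoring case. In this example the answer should be 2.
Edit: When I search for "ello th", str_count returns 1, but I want only exact whole-word matches (delimited by spaces) to count, so it should return zero. For example, if I want to find "very good" in a particular string such as: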

It is very good to speak like thevery good

Here the number of occurrences should be 1, not 2. I hope that makes sense.

Answer 1

You can also try:

library(stringi)
stri_count(str, regex="(?i)hello")
#[1] 2

str1 <- "It is very good to speak like thevery good"
stri_count(str1, regex="\\b(?i)very good\\b")
#[1] 1

Answer 2

Perhaps the simplest and most direct way is to use str_count from stringr:

str <- "Hello this is devavrata! here, say again hello"
library(stringr)
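# ignore.case() is deprecated since stringr 1.0.0; regex(ignore_case = TRUE) is its replacement
str_count(str, regex("hello", ignore_case = TRUE))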
# [1] 2

Two base R approaches:

length(grep("hello", strsplit(str, " ")[[1]], ignore.case = TRUE))
# [1] 2

and

sum(gregexpr("hello", str, ignore.case = TRUE)[[1]] > 0)
# [1] 2
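
For the whole-word case raised in the edit to the question, the gregexpr approach could probably be extended by wrapping the pattern in \\b word boundaries. The following is only a sketch: count_phrase is a hypothetical helper name, not a base R function, and it assumes the phrase contains no regex metacharacters.

```r
# count_phrase is a hypothetical helper, not part of base R.
# Assumption: `phrase` contains no regex metacharacters that need escaping.
count_phrase <- function(text, phrase) {
  # \\b anchors the match at word boundaries on both sides
  pattern <- paste0("\\b", phrase, "\\b")
  m <- gregexpr(pattern, text, ignore.case = TRUE, perl = TRUE)[[1]]
  # gregexpr returns -1 when nothing matches, so count only positive positions
  sum(m > 0)
}

count_phrase("It is very good to speak like thevery good", "very good")
# [1] 1
count_phrase("Hello this is devavrata! here, say again hello", "hello")
# [1] 2
count_phrase("Hello this is devavrata! here, say again hello", "ello th")
# [1] 0
```

The "thevery good" occurrence is skipped because there is no word boundary between "the" and "very", and "ello th" is skipped because it starts in the middle of "Hello".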

Answer 3

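I'm late to this, but I think the termco function from the qdap package does exactly what you want. You can use leading and/or trailing spaces to control word boundaries, as in the example below: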

x <- c("Hello this is devavrata! here, say again hello",
    "It is very good to speak like thevery good")

library(qdap)
(out <- termco(x, id(x), list("hello", "very good", " very good ")))

##   x word.count     hello very good very good
## 1 1          8 2(25.00%)         0         0
## 2 2          9         0 2(22.22%) 1(11.11%)

## To get a data frame of pure counts:
out %>% counts()

##   x word.count hello very good very good
## 1 1          8     2         0         0
## 2 2          9     0         2         1
