Consider the following code, which counts the occurrences of the letter 'a' in each word:
data <- data.frame(number = 1:4, string = c("this.is.a.great.word", "Education", "Earth.Is.Round", "Pinky"), stringsAsFactors = FALSE)
library(stringr)
data$Count_of_a <- str_count(data$string, "a")
data
This produces the following result:
  number               string Count_of_a
1      1 this.is.a.great.word          2
2      2            Education          1
3      3       Earth.Is.Round          1
4      4                Pinky          0
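I am trying to do a few more things:
The problem is that if I use nchar(data$string), it also counts the dots '.'. I also could not find any help on the four requirements above.
I would like the final data to look like this: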
number string               starts_with_vowel ends_with_vowel TotalLtrs
1      this.is.a.great.word 0                 0               16
2      Education            1                 0               9
3      Earth.Is.Round       1                 0               12
4      Pinky                0                 1               5
Answer 0: (score: 2)
You want a set of regular expressions:
library(tidyverse)
data %>%
  mutate(
    nvowels = str_count(tolower(string), "[aeoiu]"),
    total_letters = str_count(tolower(string), "\\w"),
    starts_with_vowel = grepl("^[aeiou]", tolower(string)),
    ends_with_vowel = grepl("[aeiou]$", tolower(string))
  )
#   number               string nvowels total_letters starts_with_vowel ends_with_vowel
# 1      1 this.is.a.great.word       6            16             FALSE           FALSE
# 2      2            Education       5             9              TRUE           FALSE
# 3      3       Earth.Is.Round       5            12              TRUE           FALSE
# 4      4                Pinky       1             5             FALSE           FALSE
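One caveat about the `\\w` pattern: it matches digits and underscores as well as letters. That makes no difference for these four strings, but it could for other data. Below is a minimal base-R sketch, assuming only alphabetic characters should count. The helper `count_matches` and the extra test string "abc_123" are illustrative additions, not part of the answer.

```r
# Strings from the question, plus one made-up string containing digits and "_"
x <- c("this.is.a.great.word", "Education", "Earth.Is.Round", "Pinky", "abc_123")

# Hypothetical helper: number of regex matches in each string
count_matches <- function(strings, pattern) {
  lengths(regmatches(strings, gregexpr(pattern, strings)))
}

word_chars <- count_matches(x, "\\w")          # letters, digits and underscore
alpha_only <- count_matches(x, "[[:alpha:]]")  # letters only

print(data.frame(string = x, word_chars, alpha_only))
```

With stringr, the same idea would be `str_count(string, "[[:alpha:]]")`.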
If you consider y a vowel, add it like this:
nvowels = str_count(tolower(string), "[aeoiuy]")
starts_with_vowel = grepl("^[aeiouy]", tolower(string))
ends_with_vowel = grepl("[aeiouy]$", tolower(string))
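The expected table in the question uses 0/1 instead of TRUE/FALSE and marks Pinky as ending with a vowel, which suggests that y should count. Here is a minimal base-R sketch under that reading of the question. It wraps the `grepl()` calls in `as.integer()` and rebuilds the question's `data` so that it runs on its own.

```r
# Data frame from the question
data <- data.frame(
  number = 1:4,
  string = c("this.is.a.great.word", "Education", "Earth.Is.Round", "Pinky"),
  stringsAsFactors = FALSE
)

# Lower-case once so the patterns only need lower-case letters
lower <- tolower(data$string)

# grepl() returns TRUE/FALSE; as.integer() converts that to 1/0
# Assumption: y is treated as a vowel
data$starts_with_vowel <- as.integer(grepl("^[aeiouy]", lower))
data$ends_with_vowel   <- as.integer(grepl("[aeiouy]$", lower))

print(data)
```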
Answer 1: (score: 1)
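library(stringr)

# Number of vowels in each string
str_count(data$string, "a|e|i|o|u|A|E|I|O|U")
# [1] 6 5 5 1

# Total number of letters (the dots are not counted)
str_count(data$string, paste0(c(letters, LETTERS), collapse = "|"))
# [1] 16 9 12 5

# 1 if the first character is a vowel, otherwise 0
ifelse(substr(data$string, 1, 1) %in% c("a", "e", "i", "o", "u", "A", "E", "I", "O", "U"), 1, 0)
# [1] 0 1 1 0

# 1 if the last character is a vowel, otherwise 0
ifelse(substr(data$string, nchar(data$string), nchar(data$string)) %in% c("a", "e", "i", "o", "u", "A", "E", "I", "O", "U"), 1, 0)
# [1] 0 0 0 0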
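The two `%in%` checks can be shortened by lower-casing the extracted character first, so the vowel set is written only once. This is a minimal base-R sketch of that variation, not the answerer's code. The names `x`, `vowels`, `first_char` and `last_char` are illustrative, and y is not treated as a vowel here.

```r
# Strings from the question
x <- c("this.is.a.great.word", "Education", "Earth.Is.Round", "Pinky")

vowels <- c("a", "e", "i", "o", "u")

# First and last character of each string, lower-cased
first_char <- tolower(substr(x, 1, 1))
last_char  <- tolower(substr(x, nchar(x), nchar(x)))

# %in% returns TRUE/FALSE; as.integer() converts that to 1/0
starts_with_vowel <- as.integer(first_char %in% vowels)
ends_with_vowel   <- as.integer(last_char %in% vowels)

print(data.frame(string = x, starts_with_vowel, ends_with_vowel))
```

Adding "y" to `vowels` would flag Pinky as ending with a vowel, as in the question's expected table.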