For digits, I have already done this:
digits <- c("0","1","2","3","4","5","6","7","8","9")
Answer 0 (score: 4)
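You can use the POSIX character class [:punct:] to detect punctuation. It must appear inside a bracket expression, i.e. [[:punct:]], and it matches the characters
[!"\#$%&'()*+,\-./:;<=>?@\[\\\]^_`{|}~]
With gregexpr: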
x = c("we are friends!, Good Friends!!")
gregexpr("[[:punct:]]", x)
R> gregexpr("[[:punct:]]", x)
[[1]]
[1] 15 16 30 31
attr(,"match.length")
[1] 1 1 1 1
attr(,"useBytes")
[1] TRUE
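gregexpr returns match positions rather than a count. A minimal base R sketch of turning those matches into a total, reusing the same x as above (the name n_punct is just an illustrative choice); going through regmatches avoids miscounting the -1 that gregexpr returns when nothing matches:

```r
x <- c("we are friends!, Good Friends!!")

# Positions of every punctuation character, one list element per string
m <- gregexpr("[[:punct:]]", x)

# regmatches() extracts the matched substrings (character(0) if none),
# and lengths() counts them per input string
n_punct <- lengths(regmatches(x, m))
n_punct
```

For the example string this gives 4, matching the four positions shown above.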
Or via stringi:
# Gives 4
stringi::stri_count_regex(x, "[:punct:]")
Note that the comma (,) is counted as punctuation.
The question seems to be about getting individual counts for specific punctuation marks. @Joba gave a concise answer in the comments:
## Create a vector of punctuation marks you are interested in
punct = strsplit('[]?!"\'#$%&(){}+*/:;,._`|~[<=>@^-]\\', '')[[1]]
## Count how often each of them occurs
counts = stringi::stri_count_fixed(x, punct)
## Name the counts by their punctuation mark
setNames(counts, punct)
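If stringi is not available, a rough base R alternative is to extract the matched characters and tabulate them. This is only a sketch, not the commenter's method; marks and per_mark are illustrative names. Unlike the fixed-vector approach, it lists only the marks that actually occur in x:

```r
x <- c("we are friends!, Good Friends!!")

# All punctuation characters found in the first (and only) string
marks <- regmatches(x, gregexpr("[[:punct:]]", x))[[1]]

# Frequency of each distinct mark
per_mark <- table(marks)
per_mark
```

Here "!" is counted three times and "," once.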
Answer 1 (score: 1)
You can use a regular expression:
stringi::stri_count_regex("amdfa, ad,a, ad,. ", "[:punct:]")
https://en.wikipedia.org/wiki/Regular_expression
may also be helpful.