I have a character vector that looks like this:
z <- c("./.", "To/TO", "my/PRP$", "starved/VBN", ",/,", "wretched/JJ") # test input
[9992] "./."
[9993] "To/TO"
[9994] "my/PRP$"
[9995] "starved/VBN"
[9996] ",/,"
[9997] "wretched/JJ"
I want to remove every entry that contains three consecutive punctuation characters, so that the result looks like this:
[9993] "To/TO"
[9994] "my/PRP$"
[9995] "starved/VBN"
[9997] "wretched/JJ"
I have tried different regular expressions:
sub("[:punct:]/[:punct:]", "", z)
and
sub("[:punct:]{3}", "", z)
with both single and double brackets, and both give:
[9992] "./."
[9993] "To"
[9994] "my$"
[9995] "starved"
[9996] ",/,"
[9997] "wretched"
Any ideas? Apologies in advance if this is a silly question; I'm not very good at this!
Answer 0 (score: 4)
Try this:
x <- c("./.", "To/TO", "my/PRP$", "starved/VBN", ",/,", "wretched/JJ") # test input
grep("[[:punct:]]{3}", x, value = TRUE, invert = TRUE)
## [1] "To/TO" "my/PRP$" "starved/VBN" "wretched/JJ"
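Two things seem to be going on in the original attempts. First, a POSIX class such as [:punct:] is only valid inside a bracket expression, so it has to be written as [[:punct:]]. Second, sub() only edits the text inside each element and never drops elements, so even with the right pattern the vector keeps its length. The sketch below shows an equivalent way to filter with grepl() and logical indexing. The names is_punct_only, kept and replaced are just illustrative, and anchoring the pattern with ^ and $ is my assumption that you only want to drop entries made up of exactly three punctuation characters.

```r
z <- c("./.", "To/TO", "my/PRP$", "starved/VBN", ",/,", "wretched/JJ")

# grepl() returns one TRUE/FALSE per element.
# ^ and $ anchor the match so the whole entry must be three punctuation characters.
is_punct_only <- grepl("^[[:punct:]]{3}$", z)

# Keep only the elements that did not match.
kept <- z[!is_punct_only]
print(kept)

# For comparison: sub() replaces the match with "" but keeps the element,
# so the result still has the same length as z.
replaced <- sub("[[:punct:]]{3}", "", z)
print(length(replaced) == length(z))
```

On this test input the anchored and unanchored patterns select the same entries, so kept should agree with the grep(..., invert = TRUE) call above. The anchored form would only behave differently if a longer token happened to contain three punctuation characters in a row.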