Regular expression for consecutive punctuation characters in R

Time: 2016-04-20 21:09:37

Tags: regex r

I have a character vector that looks like this:

z <- c("./.", "To/TO", "my/PRP$", "starved/VBN", ",/,", "wretched/JJ") # test input

 [9992] "./."                           
 [9993] "To/TO"                         
 [9994] "my/PRP$"                       
 [9995] "starved/VBN"                   
 [9996] ",/,"
 [9997] "wretched/JJ" 

I want to remove every entry that contains three consecutive punctuation characters, so that the result looks like this:

 [9993] "To/TO"                         
 [9994] "my/PRP$"                       
 [9995] "starved/VBN"                   
 [9997] "wretched/JJ"

I have tried different regular expressions:

sub("[:punct:]/[:punct:]", "", z)

sub("[:punct:]{3}", "", z) 

with single and with double brackets, and both give:

 [9992] "./."                        
 [9993] "To"                         
 [9994] "my$"                        
 [9995] "starved"                    
 [9996] ",/,"                        
 [9997] "wretched"

Any ideas? Apologies in advance if this is a silly question; I'm not very good at this!

1 answer:

Answer 0 (score: 4)

Try this:

x <- c("./.", "To/TO", "my/PRP$", "starved/VBN", ",/,", "wretched/JJ") # test input

grep("[[:punct:]]{3}", x, value = TRUE, invert = TRUE)
## [1] "To/TO"       "my/PRP$"     "starved/VBN" "wretched/JJ"
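
For what it's worth, two things seem to differ from the attempts in the question. A POSIX class such as [:punct:] is only recognised inside a bracket expression, hence [[:punct:]], and sub() edits the matched text inside each element rather than dropping elements from the vector. The sketch below shows the same filter written with grepl() and logical indexing, plus an anchored variant. The names has_run, kept and kept_strict are made up for illustration, and the anchored pattern assumes you only want to drop entries that consist of exactly three punctuation characters and nothing else.

```r
x <- c("./.", "To/TO", "my/PRP$", "starved/VBN", ",/,", "wretched/JJ") # test input

# TRUE for elements that contain three punctuation characters in a row
has_run <- grepl("[[:punct:]]{3}", x)

# keep only the elements without such a run
kept <- x[!has_run]

# assumption: stricter variant that drops an element only when the whole
# string is exactly three punctuation characters
kept_strict <- x[!grepl("^[[:punct:]]{3}$", x)]
```

On this test input both variants keep the same four elements as the grep() call above. They would only differ on an entry such as a longer token that happens to contain a run of three punctuation characters in the middle.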