
Time: 2017-10-06 11:00:50

Tags: r regex





c(",At the end of the study everything was great\n,There is an funny looking thing somewhere but I didn't look at it too hard\nSome other sentence\n The test ended.",",Not sure how to get this regex sorted\nI don't know how to get rid of sentences between the two nearest carriage returns but without my head spinning\nHow do I do this")


,At the end of the study everything was great\n,Some other sentence\nThe test ended.
,Not sure how to get this regex sorted\n\nHow do I do this


2 answers:

Answer 0: (score: 1)

Note that you mixed up "\n" and "/n"; I have corrected that here.


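1) Simply match all non-newline characters ([^\n]) before and after "but".

2) (Edit) To address the problem Wiktor spotted, we must also check that no letter comes directly before or after "but", which is what [^a-zA-Z] on each side does.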

x <- c(",At the end of the study everything was great\n,There is an funny looking thing somewhere but I didn't look at it too hard\nSome other sentence\n The test ended.",
       ",Not sure how to get this regex sorted\nI don't know how to get rid of sentences between the two nearest carriage returns but without my head spinning\nHow do I do this")

> gsub("[^\n]*[^a-zA-Z]but[^a-zA-Z][^\n]*", "", x)
[1] ",At the end of the study everything was great\n\nSome other sentence\n The test ended."
[2] ",Not sure how to get this regex sorted\n\nHow do I do this" 
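
The effect of the [^a-zA-Z] guard from step 2 can be checked on a small made-up input. This is only a sketch: the string y and the names naive and guarded are hypothetical and not part of the question.

```r
# Hypothetical input (not from the question): "butter" contains "but"
# as a substring, but not as a standalone word.
y <- "keep this\nbutter is tasty\nremove this but not that\nkeep this too"

# Step 1 only: every line containing the substring "but" is emptied.
naive <- gsub("[^\n]*but[^\n]*", "", y)

# Step 2: additionally require a non-letter on both sides of "but".
guarded <- gsub("[^\n]*[^a-zA-Z]but[^a-zA-Z][^\n]*", "", y)

print(naive)
print(guarded)
```

Without the guard the "butter" line is emptied as well; with it, only the line where "but" stands as a word of its own is removed.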

Answer 1: (score: 1)


x <- c(",At the end of the study everything was great\n,There is an funny looking thing somewhere but I didn't look at it too hard\nSome other sentence\n The test ended.", ",Not sure how to get this regex sorted\nI don't know how to get rid of sentences between the two nearest carriage returns but without my head spinning\nHow do I do this")
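# PCRE engine (perl=TRUE): . does not match line breaks by default
gsub(".*\\bbut\\b.*[\r\n]*", "", x, ignore.case=TRUE, perl=TRUE)
# Default TRE engine: the (?n) modifier stops . from matching line breaks
gsub("(?n).*\\bbut\\b.*[\r\n]*", "", x, ignore.case=TRUE)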

See the R demo online.


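  • .* - any 0+ characters other than line break characters, as many as possible
  • \\bbut\\b - the whole word but (\b is a word boundary)
  • .* - any 0+ characters other than line break characters, as many as possible
  • [\r\n]* - 0 or more line break characters.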
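
The pieces listed above can be tried on a made-up input. This is a minimal sketch using the perl=TRUE variant; the string y and the name res are hypothetical and not part of the question.

```r
# Hypothetical input (not from the question): one line has "BUT" as a whole
# word in capitals, another only contains it inside "butter".
y <- "first line\nBUT in capitals\nbutter stays\nlast line"

# ignore.case=TRUE lets "BUT" match; \\b keeps "butter" from matching;
# [\r\n]* swallows the line break that follows the removed line.
res <- gsub(".*\\bbut\\b.*[\r\n]*", "", y, ignore.case = TRUE, perl = TRUE)

print(res)
```

Because [\r\n]* also consumes the line break after the matched line, no empty line is left behind, whereas the [^\n]*-based pattern from Answer 0 keeps both surrounding "\n" characters.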
