How to remove all characters except those enclosed in parentheses

Time: 2017-07-09 14:55:55

Tags: r

Suppose you have a text file like the one shown below:

I have apples, bananas, ( some pineapples over 4 ), and cherries ( coconuts with happy face :D ) and so on. You may help yourself except for cherries ( they are for my parents sorry ;C ) . I feel like I can run a fruit business.

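My goal is to remove every character except those enclosed in parentheses. Keep in mind that the characters inside a pair of parentheses can be anything, from English letters to other characters, but no other punctuation acts as an enclosing delimiter: only parentheses are used.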

I think I should use gsub, but I am not sure.

This is the result I want:

( some pineapples over 4 ) ( coconuts with happy face :D ) ( they are for my parents sorry ;C )

Whether it is done by deleting or by extracting, I would like to get the result above.

1 answer:

Answer 0: (score: 1)

We can do this by extracting the substrings inside the parentheses (parentheses included) and using paste to join them together:

library(stringr)
paste(str_extract_all(str1, "\\([^)]*\\)")[[1]], collapse=' ')
#[1] "( some pineapples over 4 ) ( coconuts with happy face :D ) ( they are for my parents sorry ;C )"
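
If you would rather not load stringr, the same extract-then-paste idea can be sketched in base R with gregexpr and regmatches. This is only an illustrative variant, not part of the original answer; the sample string x below is a shortened, made-up input rather than the str1 data.

```r
# Shortened sample input for illustration only (not the original str1)
x <- "apples ( pineapples over 4 ), cherries ( coconuts :D ) and so on."

# gregexpr() locates every "( ... )" span; regmatches() extracts them.
# [[1]] picks the matches for the first (and only) element of x.
matches <- regmatches(x, gregexpr("\\([^)]*\\)", x))[[1]]

# Join the extracted pieces with a single space
result <- paste(matches, collapse = " ")
result
#[1] "( pineapples over 4 ) ( coconuts :D )"
```

The pattern is the same one used with str_extract_all: a literal opening parenthesis, any run of characters that are not a closing parenthesis, then the closing parenthesis.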

Or we can use a gsub-based solution:

trimws(gsub("\\s+\\([^)]*\\)(*SKIP)(*FAIL)|.", "", str1, perl = TRUE))
#[1] "( some pineapples over 4 ) ( coconuts with happy face :D ) ( they are for my parents sorry ;C )"
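
To see what the gsub pattern is doing, it may help to run it on a tiny input. The first alternative matches whitespace followed by a parenthesized group and then discards that match with (*SKIP)(*FAIL), so the group is left in place; the second alternative, the lone dot, matches every other character and replaces it with nothing. The strings x and y and the looser pattern below are made up for illustration and are not part of the original answer.

```r
# Tiny made-up input
x <- "a ( b ) c ( d )"
pat <- "\\s+\\([^)]*\\)(*SKIP)(*FAIL)|."

# Each protected group keeps the whitespace in front of it,
# which is why the result starts with a space before trimws()
kept <- gsub(pat, "", x, perl = TRUE)
kept
#[1] " ( b ) ( d )"
trimws(kept)
#[1] "( b ) ( d )"

# Caveat: because the pattern requires \\s+ before "(", a group at the
# very start of the string is not protected and gets deleted
y <- "( a ) b"
at_start <- gsub(pat, "", y, perl = TRUE)
at_start
#[1] ""

# A possible variant (an assumption, not from the answer): \\s* also
# protects groups that have no whitespace in front of them
loose <- "\\s*\\([^)]*\\)(*SKIP)(*FAIL)|."
fixed <- gsub(loose, "", y, perl = TRUE)
fixed
#[1] "( a )"
```

For the str1 data every parenthesized group is preceded by a space, so the original pattern gives the expected output; the looser variant only matters if your text can begin with a parenthesis or have one glued to the preceding word.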

Data

str1 <- "I have apples, bananas, ( some pineapples over 4 ), and cherries ( coconuts with happy face :D ) and so on. You may help yourself except for cherries ( they are for my parents sorry ;C ) . I feel like I can run a fruit business."