gsub across multiple lines in R (dotall)

Time: 2015-04-07 19:23:32

Tags: regex r gsub

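Is it possible to use a dotall expression with gsub in R? Basically, I am trying to remove a section of text that spans multiple lines. Consider the following example: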

eg.df <- c("----------", " ", "keep", " ", "keep this too", " ", "----------", " ", 
   "Delete this line and everything after", "Delete this one too", 
   " ", "And delete this one")

I want to use lines 7-9 as the pattern to match. I want to delete those lines and everything that follows, up to the end of the file.

[1] "----------"                           
[2] " "                                    
[3] "keep"                                 
[4] " "                                    
[5] "keep this too"                        
[6] " "                                    
[7] "----------"                           
[8] " "                                    
[9] "Delete this line and everything after"
[10] "Delete this one too"                  
[11] " "                                    
[12] "And delete this one"

So the resulting output would be:

[1] "----------"                           
[2] " "                                    
[3] "keep"                                 
[4] " "                                    
[5] "keep this too"                        
[6] " "               

1 answer:

Answer 0: (score: 3)

You could try

  strsplit(sub('-+, +,[A-Za-z]+[^-]+$', '', 
         paste(eg.df, collapse= ',')), ',')[[1]]
  #[1] "----------"    " "             "keep"          " "            
  #[5] "keep this too" " "
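
To see why this works, the one-liner can be unpacked into three steps: collapse the vector into a single string, strip everything from the second dashed line onward, and split what remains back into elements. This is only a sketch; the intermediate names `collapsed`, `trimmed` and `result` are illustrative, and it assumes no element of `eg.df` contains a comma (the separator used here).

```r
eg.df <- c("----------", " ", "keep", " ", "keep this too", " ", "----------", " ",
           "Delete this line and everything after", "Delete this one too",
           " ", "And delete this one")

# Step 1: join all elements into one comma-separated string
collapsed <- paste(eg.df, collapse = ",")

# Step 2: match a dashed line, a blank element, a word, then any run of
# non-dash characters up to the end of the string. Because [^-]+ cannot
# cross the second dashed line, the match cannot start at the first one.
trimmed <- sub("-+, +,[A-Za-z]+[^-]+$", "", collapsed)

# Step 3: split back into a character vector
result <- strsplit(trimmed, ",")[[1]]
print(result)
```

If the real data may contain commas, a separator that never occurs in the text would be a safer choice.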

Or, as @hwnd commented,

  strsplit(sub('-+[^-]+\\z', '', paste(eg.df, collapse = '_'), 
               perl = TRUE), '_')[[1]]
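
On the literal question about dotall: with `perl = TRUE`, R uses PCRE, where the inline modifier `(?s)` makes `.` match newlines as well. A minimal sketch, assuming the vector is collapsed with `"\n"` and that the section to cut starts at a dashed line followed by a blank line and a line beginning with "Delete this line" (the names `txt`, `cleaned` and `result` are illustrative):

```r
eg.df <- c("----------", " ", "keep", " ", "keep this too", " ", "----------", " ",
           "Delete this line and everything after", "Delete this one too",
           " ", "And delete this one")

# Collapse to a single multi-line string
txt <- paste(eg.df, collapse = "\n")

# (?s) turns on dotall, so .* runs across newlines up to the very end (\z).
# The leading \n is consumed too, so no empty element is left behind.
cleaned <- sub("(?s)\\n-+\\n \\nDelete this line.*\\z", "", txt, perl = TRUE)

# Split back into lines; fixed = TRUE treats "\n" as a literal newline
result <- strsplit(cleaned, "\n", fixed = TRUE)[[1]]
print(result)
```

Anchoring on the text of line 9 rather than on "no further dashes" means this version would still work if the deleted section itself contained a `-`.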