Delete the first text block matching a regular expression with sed

Date: 2017-08-10 21:10:21

Tags: regex bash awk sed

I have text like this:

# This configuration was generated by
# `rubocop --auto-gen-config`

# Offense count: 1
# Configuration parameters: Include.
# Include: **/Gemfile, **/gems.rb
Bundler/DuplicatedGem:
  Exclude:
    - 'Gemfile'

# Offense count: 24
# Cop supports --auto-correct.
# Configuration parameters: Include, TreatCommentsAsGroupSeparators.
# Include: **/Gemfile, **/gems.rb
Bundler/OrderedGems:
  Exclude:
    - 'Gemfile'

# Offense count: 1
# Cop supports --auto-correct.
Layout/MultilineBlockLayout:
  Exclude:
    - 'test/unit/github_fetcher/issue_comments_test.rb'

# Offense count: 1
# Cop supports --auto-correct.
# Configuration parameters: EnforcedStyle, SupportedStyles.
# SupportedStyles: symmetrical, new_line, same_line
Layout/MultilineHashBraceLayout:
  Exclude:
    - 'config/environments/production.rb'

I want to delete only the first text block that starts with Offense count. I have a working regex: /^# Offense([\s\S]+?)\n\n/m

If I use sed, I get an error:

$ sed -e '/^# Offense([\s\S]+?)\n\n\/d' .rubocop_todo.yml
sed: 1: "/^# Offense([\s\S]+?)\n ...": unterminated regular expression

If I pass an empty string as the first argument, it does nothing:

$ sed -e '' '/^# Offense([\s\S]+?)\n\n\/d' .rubocop_todo.yml

Why does it fail? What should I do?

I am on OS X, with awk version 20070501 and GNU Awk 4.1.4, API: 1.1 (GNU MPFR 3.1.5, GNU MP 6.1.2).

2 answers:

Answer 0 (score: 3)

Using awk:

awk 'BEGIN{RS=ORS="\n\n"}!/^# Offense/||a++' file

Details:

BEGIN {             # before starting to read the records
    RS=ORS="\n\n"   # define the record separator(RS) and the output record
                    # separator(ORS) 
}

# condition: when it's true, the record is printed
!/^# Offense/ # doesn't start with "# Offense"
||            # OR
a++           # "a" is true ( at the first block that starts with "# Offense", "a"
              # isn't defined and evaluated as false, then it is incremented and
              # evaluated as true for the next blocks.)

Answer 1 (score: 1)

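sed says "unterminated regular expression" because the last slash has a backslash in front of it: \/ escapes this final slash, so the regex never gets its closing delimiter and the expression is invalid.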
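The effect of the escaped delimiter can be checked with a small sketch (the inputs and variable names here are made up for illustration). It also shows a second obstacle that is not part of the answer above: sed normally holds one line at a time in its pattern space, so a pattern that requires \n\n cannot match even once the delimiter is fixed.

```shell
# Hypothetical two-line input, used only for this demonstration.
input=$(printf 'foo\nbar')

# Escaped final slash: sed finds no closing delimiter and exits with an error.
if printf '%s\n' "$input" | sed -e '/foo\/d' >/dev/null 2>&1; then
  escaped_status="ok"
else
  escaped_status="error"
fi

# Plain closing slash: the command parses and deletes the matching line.
fixed_output=$(printf '%s\n' "$input" | sed -e '/foo/d')

# Each line is matched on its own, so a pattern needing two newlines
# never matches and the input passes through unchanged.
linewise_output=$(printf 'a\n\nb\n' | sed -e '/a\n\n/d')

echo "$escaped_status"
echo "$fixed_output"
echo "$linewise_output"
```

The exact error text differs between sed implementations, so the sketch only checks the exit status.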

I think you can do this with this Perl one-liner:

perl -0pe 's/# Offense.*?\n\n//s' test.yml

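Where: -0 sets the record separator to the null character, which effectively reads the whole file as one string; -p prints the result (to replace in place, add -i, i.e. perl -i -0pe ...); -e treats the next string as the code to run. *? makes the match non-greedy, so only the first block is matched. The /s modifier also makes the dot match newlines.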

Output: