Question

I have text like this:

# This configuration was generated by
# `rubocop --auto-gen-config`

# Offense count: 1
# Configuration parameters: Include.
# Include: **/Gemfile, **/gems.rb
Bundler/DuplicatedGem:
  Exclude:
    - 'Gemfile'

# Offense count: 24
# Cop supports --auto-correct.
# Configuration parameters: Include, TreatCommentsAsGroupSeparators.
# Include: **/Gemfile, **/gems.rb
Bundler/OrderedGems:
  Exclude:
    - 'Gemfile'

# Offense count: 1
# Cop supports --auto-correct.
Layout/MultilineBlockLayout:
  Exclude:
    - 'test/unit/github_fetcher/issue_comments_test.rb'

# Offense count: 1
# Cop supports --auto-correct.
# Configuration parameters: EnforcedStyle, SupportedStyles.
# SupportedStyles: symmetrical, new_line, same_line
Layout/MultilineHashBraceLayout:
  Exclude:
    - 'config/environments/production.rb'

I want to delete only the first text block that starts with Offense count. I have a working regex: /^# Offense([\s\S]+?)\n\n/m

If I use sed, I get an error:

$ sed -e '/^# Offense([\s\S]+?)\n\n\/d' .rubocop_todo.yml
sed: 1: "/^# Offense([\s\S]+?)\n ...": unterminated regular expression

If I pass an empty string as the first argument, it does nothing:

$ sed -e '' '/^# Offense([\s\S]+?)\n\n\/d' .rubocop_todo.yml

Why does this fail, and what should I do instead?

I'm on OS X with awk version 20070501 or GNU Awk 4.1.4, API: 1.1 (GNU MPFR 3.1.5, GNU MP 6.1.2).

Answer 1

Using awk:

awk 'BEGIN{RS=ORS="\n\n"}!/^# Offense/||a++' file

Details:

BEGIN {             # before starting to read the records
    RS=ORS="\n\n"   # define the record separator(RS) and the output record
                    # separator(ORS) 
}

# condition: when it's true, the record is printed
!/^# Offense/ # doesn't start with "# Offense"
||            # OR
a++           # "a" is true ( at the first block that starts with "# Offense", "a"
              # isn't defined and evaluated as false, then it is incremented and
              # evaluated as true for the next blocks.)
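
To see the a++ idiom at work, here is a small sketch run against a made-up sample file (the file contents and the variable names tmp, seen and result are illustrative, not from the question). One deliberate change, which is an assumption on my part rather than part of the answer above: it uses paragraph mode (RS="") instead of RS="\n\n", because POSIX leaves a multi-character RS unspecified and some older awks may honor only its first character.

```shell
# Build a small sample with one header block and two "# Offense" blocks.
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
# This header block is kept

# Offense count: 1
Bundler/DuplicatedGem:
  Exclude:
    - 'Gemfile'

# Offense count: 24
Bundler/OrderedGems:
  Exclude:
    - 'Gemfile'
EOF

# RS="" is awk's paragraph mode: records are separated by blank lines.
# A record is printed unless it is the first one starting with "# Offense":
# seen++ is 0 (false) for that first block and non-zero afterwards.
result=$(awk 'BEGIN { RS = ""; ORS = "\n\n" } !/^# Offense/ || seen++' "$tmp")

printf '%s\n' "$result"
rm -f "$tmp"
```

Only the Bundler/DuplicatedGem block should disappear; the header and the second Offense block pass through unchanged.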

Answer 2

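sed reports "unterminated regular expression" because the final slash has a backslash in front of it: \/ escapes that closing slash, so the regular expression is never terminated. Even with that fixed, sed processes its input one line at a time by default, so a pattern containing \n\n cannot match across lines as written.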
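
The delimiter problem can be checked in isolation with a shortened pattern. This is only a sketch: the sample input and the variable names escaped_status and plain_output are made up for illustration.

```shell
# The trailing "\/" is an escaped slash, so the address regex never closes
# and sed refuses to run the script at all.
if sed -e '/^# Offense\/d' </dev/null >/dev/null 2>&1; then
  escaped_status=accepted
else
  escaped_status=rejected
fi

# With a plain closing slash the script parses, and the matching line is deleted.
plain_output=$(printf '# Offense count: 1\nkeep me\n' | sed -e '/^# Offense/d')

echo "escaped delimiter: $escaped_status"
echo "plain delimiter output: $plain_output"
```

Note that the second command deletes single matching lines only, which is why a block-oriented tool is a better fit for this task.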

I think you can do this with a Perl one-liner:

perl -0pe 's/# Offense.*?\n\n//s' test.yml

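Where: -0 sets the record separator to the null character, effectively reading the whole file as one string; -p prints the result (to replace in place, add -i, i.e. perl -i -0pe ...); -e takes the next argument as the Perl code to run. *? makes the match non-greedy, so only the first block is matched. The /s modifier also makes the dot match newlines.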

Remove first text block matching regex with sed
