Question

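When I use sed to delete certain characters from each line, I'd like to use sed to see in advance what is going to be deleted. How can I do that? For example, I have a source code file like the one below, and I want to delete the line numbers at the beginning: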

            102.      for (int i=0; i < args.length; ++i) {
            103.        if ("-skip".equals(args[i])) {
                104.          DistributedCache.addCacheFile(new Path(args[++i]).toUri(), conf);
                105.          conf.setBoolean("wordcount.skip.patterns", true);
                106.        } else {
                107.          other_args.add(args[i]);
                108.        }
            109.      }

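Can I test the regular expression with sed, so that I can later use 's' to delete the match (that is, replace it with an empty string)? And for this specific example, what is the right regexp to delete the line numbers? Is it possible to use sed to replace them with the proper indentation for source code? That would be powerful!

Thanks.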

Answer 1

This might help:

sed -r 's/^\s*[0-9]+\.//' file  # Corrected as @Michael specified in the comments, no need for `g`.

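By default sed uses BRE (basic regular expressions). The -r option makes GNU sed use ERE (extended regular expressions), which lets us write + without a backslash. \s matches a whitespace character; it is a GNU extension.

^ means the start of the line. We add \s followed by * (meaning 0 or more), then the digit class [0-9] followed by + (meaning 1 or more), then \., and delete all of it by leaving the replacement part empty. Note that we escape the ., because an unescaped . means any character in a regex. So to match a literal . we escape it.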
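The pieces described above can be sketched as a small shell function. This is only an illustration: `strip_line_numbers` is a hypothetical name, and the sketch assumes a sed that accepts `-E` (the spelling of `-r` that both GNU and BSD sed understand) and uses `[[:space:]]` instead of `\s` so it does not depend on the GNU extension.

```shell
# strip_line_numbers is a hypothetical helper name, not part of sed.
# It reads stdin and removes a leading "<whitespace><digits>." from each line.
strip_line_numbers() {
  # ^             start of the line
  # [[:space:]]*  zero or more whitespace characters
  # [0-9]+        one or more digits (unescaped + needs ERE, hence -E)
  # \.            a literal dot
  sed -E 's/^[[:space:]]*[0-9]+\.//'
}

# Try it on one line of the sample before touching the real file.
printf '%s\n' '            102.      for (int i=0; i < args.length; ++i) {' | strip_line_numbers
```

Whatever whitespace follows the dot is left alone, so the code keeps its own indentation.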

Execution:

To identify the part that matches, you can use something like this:

[jaypal:~/Temp] cat file
            102.      for (int i=0; i < args.length; ++i) {
            103.        if ("-skip".equals(args[i])) {
                104.          DistributedCache.addCacheFile(new Path(args[++i]).toUri(), conf);
                105.          conf.setBoolean("wordcount.skip.patterns", true);
                106.        } else {
                107.          other_args.add(args[i]);
                108.        }
            109.      }

[jaypal:~/Temp] sed -r 's/^\s*[0-9]+\.//' file
      for (int i=0; i < args.length; ++i) {
        if ("-skip".equals(args[i])) {
          DistributedCache.addCacheFile(new Path(args[++i]).toUri(), conf);
          conf.setBoolean("wordcount.skip.patterns", true);
        } else {
          other_args.add(args[i]);
        }
      }

Answer 2

Since Jaypal has already given an answer showing the ERE variant, I'll give one for BRE:

sed 's/^[[:space:]]*[[:digit:]]\+\.//'

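I also used POSIX character classes, to show another option, since I find them easier to remember.

To see exactly what you are about to change, there are a couple of reasonable possibilities. You can use the same regular expression with grep -o: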

grep -o '^[[:space:]]*[[:digit:]]\+\.'

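This selects only the part of each line that matches the regular expression, in this case showing just the indented line numbers.

Another approach is to use sed again. You can use & in the replacement string to refer to the matched text, so you can mark the selected region, for example with asterisks: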
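Because the matches here start with whitespace, plain grep -o output can be hard to read. A minimal sketch that wraps each match in brackets so the leading spaces are visible; `preview_matches` is a hypothetical name, and `[[:digit:]][[:digit:]]*` is used in place of `\+` only to avoid relying on a GNU extension:

```shell
# preview_matches is a hypothetical helper name.
# It reads stdin, prints only the text the s command would delete,
# and wraps each match in [ ] so leading whitespace is visible.
preview_matches() {
  grep -o '^[[:space:]]*[[:digit:]][[:digit:]]*\.' | sed 's/.*/[&]/'
}

# Lines without a leading number produce no output at all.
printf '%s\n' '   102.      for (...) {' 'no number here' | preview_matches
```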

sed 's/^[[:space:]]*[[:digit:]]\+\./***&***/'

which gives:

***            102.***      for (int i=0; i < args.length; ++i) {
***            103.***        if ("-skip".equals(args[i])) {
***                104.***          DistributedCache.addCacheFile(new Path(args[++i]).toUri(), conf);
***                105.***          conf.setBoolean("wordcount.skip.patterns", true);
***                106.***        } else {
***                107.***          other_args.add(args[i]);
***                108.***        }
***            109.***      }
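On a larger file it can help to print only the lines that would actually change. A small sketch using sed's -n option together with the p flag of the s command; `show_changed_lines` is a hypothetical name, and the marker is the same `***&***` as above:

```shell
# show_changed_lines is a hypothetical helper name.
# -n suppresses sed's default output; the trailing p flag prints a line
# only when the substitution matched, so untouched lines are hidden.
show_changed_lines() {
  sed -n 's/^[[:space:]]*[[:digit:]][[:digit:]]*\./***&***/p'
}

printf '%s\n' '    102.      for (...) {' '// a line with no number' | show_changed_lines
```

Once the marked output looks right, emptying the replacement (`s/.../` followed by `//`) performs the actual deletion.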

sed - showing the matched text as a test before doing the actual replacement
