Question

Let's say I have the following text:

name is test1 and age is test2 end
name is test3 and age is test4 end
name is test5 and age is test6 end
name is test7 and age is test8 end

I am grepping out test1, test2, and so on like this:

-bash$ grep -o -P "is .*? and|is .*? end" test
is test1 and
is test2 end
is test3 and
is test4 end
is test5 and
is test6 end
is test7 and
is test8 end

Is there a way to prepend some text to each matched pattern? I am looking for output like this:

STRING1:is test1 and
STRING2:is test2 end
STRING1:is test3 and
STRING2:is test4 end
STRING1:is test5 and
STRING2:is test6 end
STRING1:is test7 and
STRING2:is test8 end

Answer 1

I would pipe the output of grep to awk to get what you need:

grep -o -P "is .*? and|is .*? end" test | \
awk -v a=STRING1: -v b=STRING2: "/and$/ {print a\$0} /end$/ {print b\$0}"
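
For readers less familiar with awk, here is a minimal sketch of the same idea run on two hand-written sample lines that stand in for the grep output. The awk program is single-quoted here so that `$0` needs no backslash, and the variable name `out` is only illustrative.

```shell
# Sample lines standing in for the output of grep -o (assumed input).
out=$(printf 'is test1 and\nis test2 end\n' |
  awk -v a=STRING1: -v b=STRING2: '
    /and$/ { print a $0 }   # line ends in "and": prefix it with a
    /end$/ { print b $0 }   # line ends in "end": prefix it with b
  ')
printf '%s\n' "$out"
```

Each awk rule is a pattern followed by an action, so a line is prefixed only by the rule whose pattern it matches.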

Answer 2

You could use sed in the pipeline (admittedly it is not very clean):

$ grep -o -P "is .*? and|is .*? end" test | sed '/and$/s/^/STRING1:/; /end$/s/^/STRING2:/'
STRING1:is test1 and
STRING2:is test2 end
STRING1:is test3 and
STRING2:is test4 end
STRING1:is test5 and
STRING2:is test6 end
STRING1:is test7 and
STRING2:is test8 end

The /and$/ or /end$/ address in front of each substitution restricts that substitution to lines matching the regular expression.
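
To see the address mechanism in isolation, the sketch below feeds two sample lines (assumed stand-ins for the grep output) through the same sed script. The variable name `out` is only for illustration.

```shell
# Each s command runs only on lines selected by the address before it.
out=$(printf 'is test1 and\nis test2 end\n' |
  sed '/and$/s/^/STRING1:/; /end$/s/^/STRING2:/')
printf '%s\n' "$out"
```

Because `s/^/.../` matches the empty string at the start of every line, the address is the only thing that decides which prefix a line receives.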

Answer 3

Since you want to manipulate text rather than just select it, sed is a better fit for the job than grep.

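Constructing a regular expression that performs the required substitution is fairly simple. You have two substitutions, so you can use two expressions (-e). To operate only on matching lines (as in your grep example), use sed -n together with the p flag so that only lines where a substitution succeeded are printed. The tricky part is that you want to operate on the same line more than once, but once the first substitution has run, the rest of the string that the second substitution needs is gone. For example, the following comes close to what you want, but the second expression never matches because the first expression has already deleted the text it would match: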

sed -n -e 's/.*\(is .* and\).*/STRING1:\1/p' -e 's/.*\(is .* end\)/STRING2:\1/p' test
STRING1:is test1 and
STRING1:is test3 and
STRING1:is test5 and
STRING1:is test7 and
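
The failure can be reproduced step by step. This is a small sketch on one sample line; the variable names `line`, `after_first`, and `second` are illustrative and not part of the original answer.

```shell
line='name is test1 and age is test2 end'

# After the first substitution only the prefix and the captured group remain.
after_first=$(printf '%s\n' "$line" | sed 's/.*\(is .* and\).*/STRING1:\1/')
printf '%s\n' "$after_first"

# The "is ... end" part is gone, so the second expression matches nothing
# and, with -n and the p flag, prints nothing.
second=$(printf '%s\n' "$after_first" | sed -n 's/.*\(is .* end\)/STRING2:\1/p')
printf '[%s]\n' "$second"
```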

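To fix this, use the sed commands h and g: h copies the pattern space (the input line) to the hold space, and g copies it back into the pattern space for the next sed command: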

sed -n -e 'h;s/.*\(is .* and\).*/STRING1:\1/p' -e 'g;s/.*\(is .* end\)/STRING2:\1/p' test
STRING1:is test1 and
STRING2:is test2 end
STRING1:is test3 and
STRING2:is test4 end
STRING1:is test5 and
STRING2:is test6 end
STRING1:is test7 and
STRING2:is test8 end

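The line is saved in the hold space before the substitution in the first expression runs. The second expression begins by loading the pattern space from the hold space, so the second substitution has the original line to work on.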

You can merge the two separate expressions into one, but I think that makes it harder to read:

sed -n -e 'h;s/.*\(is .* and\).*/STRING1:\1/p;g;s/.*\(is .* end\).*/STRING2:\1/p' test
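
If readability is the concern, one possible layout is to put each sed command on its own line with a comment. This is only a sketch on a single sample line, with `out` as an illustrative variable name; the commands themselves are the same h, s, g, s sequence.

```shell
out=$(printf 'name is test1 and age is test2 end\n' | sed -n '
# Save the untouched input line in the hold space.
h
# Keep only the "is ... and" part, prefix it, and print on success.
s/.*\(is .* and\).*/STRING1:\1/p
# Restore the original line from the hold space.
g
# The "is ... end" part is available again for the second substitution.
s/.*\(is .* end\).*/STRING2:\1/p
')
printf '%s\n' "$out"
```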

How can I add a label in front of grep's output?
