Finding lines that relate to each other with a Bash script

Time: 2012-09-06 14:26:58

Tags: bash grep

I have a log file that contains output like this:

  [mvn] Running com.mypackage.MyTest
   ...
  [mvn] Tests run: 12, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 104.648 sec
  [mvn] Running com.mypackage.MyNotExecutedTest
   ...
  [mvn] Tests run: 0, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.525 sec
  [mvn] Running com.mypackage.AnotherNotExecutedTest
   ...
  [mvn] Tests run: 0, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.569 sec
  [mvn] Running com.mypackage.FailedTest
   ...
  [mvn] Tests run: 5, Failures: 2, Errors: 0, Skipped: 0, Time elapsed: 22.357 sec
   ...

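Here "..." can be any number of lines (for example stack traces or debug output). What I want to end up with is a list of the tests that were not executed: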

  com.mypackage.MyNotExecutedTest
  com.mypackage.AnotherNotExecutedTest

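So my approach would be to grep for the pattern "Tests run: 0, Failures: 0, Errors: 0, Skipped: 0, Time elapsed", but then I would still need some clever way to find out which test each matched line belongs to. Is there a good/elegant solution for this? Thanks!

3 Answers:

Answer 0: (score: 4)

Write an awk script that stores the test name from the most recent Running line, then prints the stored name when it sees Tests run: 0.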

awk '/\[mvn\] Running /{ t=$3 }
  /\[mvn\] Tests run: 0/ { print t }'  logfile

Edit: I took out the start-of-line anchors so that indented input is handled correctly.
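
For repeated use, the same idea can be wrapped in a shell function. This is only a sketch under a few assumptions: the function name `not_executed_tests` is made up for illustration, the pattern is tightened to `Tests run: 0,` (including the comma), and the stored name is cleared after every summary line so that a summary without its own Running line prints nothing.

```shell
#!/usr/bin/env bash
# Hypothetical helper (name chosen for illustration): print the class name of
# every test whose summary line reports "Tests run: 0".
not_executed_tests() {
  awk '
    # remember the name from the most recent Running line
    /\[mvn\] Running /      { t = $3 }
    # zero tests ran: report the remembered name, if there is one
    /\[mvn\] Tests run: 0,/ { if (t != "") print t }
    # any summary line closes the current test, so forget the name
    /\[mvn\] Tests run: /   { t = "" }
  ' "$@"
}
```

Call it as `not_executed_tests logfile`. With the log from the question it should print com.mypackage.MyNotExecutedTest and com.mypackage.AnotherNotExecutedTest, one per line.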

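Answer 1: (score: 2)

I would probably use grep and awk in combination:

grep -B1 "Tests run: 0" logfile | awk '/Running/ {print $NF}'

Note that this relies on the Running line sitting directly above the matching Tests run line.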
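
A one-line context window only helps when the Running line is directly next to its summary line, which the question rules out (stack traces or debug output may sit in between). One possible way around that, sketched here on the assumption that every summary line has its own preceding Running line, is to reduce the log to just those two kinds of lines first. The function name `zero_run_tests` is hypothetical, and `-B` is a GNU/BSD grep option rather than a POSIX one.

```shell
#!/usr/bin/env bash
# Hypothetical helper (name chosen for illustration).
zero_run_tests() {
  # 1) keep only the "Running" and "Tests run:" lines, so that each summary
  #    directly follows the Running line it belongs to
  # 2) for every zero-test summary, also print the one line before it (-B1)
  # 3) keep the Running lines and print their last field, the class name
  grep -E '\[mvn\] (Running |Tests run: )' "$@" |
    grep -B1 'Tests run: 0,' |
    awk '/Running / { print $NF }'
}
```

Usage is `zero_run_tests logfile`. The pre-filter is what makes the single line of context sufficient, regardless of how much output appears between the two lines in the original log.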

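Answer 2: (score: 1)

I would do this with a couple of grep commands and one awk, all piped together. Let me walk you through my logic:

1) Use pcregrep instead of grep to match a multi-line pattern that starts with "Running" and ends with "Tests run: 0", like so:

Command: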

pcregrep -M "Running(\n|.)*?Tests run: 0" file.txt

(Note the -M option, which allows multi-line matching, and the ? after the asterisk, which makes the match non-greedy.)

Output:

[mvn] Running com.mypackage.MyTest
...
[mvn] Tests run: 12, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 104.648 sec
[mvn] Running com.mypackage.MyNotExecutedTest
...
[mvn] Tests run: 0, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.525 sec
[mvn] Running com.mypackage.AnotherNotExecutedTest
...
[mvn] Tests run: 0, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.569 sec

2) As you can see, this unfortunately also matches some unwanted entries, so I use pcregrep again to remove the offending entries, like so:

Command:

pcregrep -M "Running(\n|.)*?Tests run: 0" file.txt | \
pcregrep -Mv "Running(\n|.)*?Tests run: [^0]"

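(Note the -v option and the [^0] character class in the second pcregrep command, which together eliminate only the processes that ran a non-zero number of tests.)

Output: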

[mvn] Running com.mypackage.MyNotExecutedTest
...
[mvn] Tests run: 0, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.525 sec
[mvn] Running com.mypackage.AnotherNotExecutedTest
...
[mvn] Tests run: 0, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.569 sec

3) Then I keep only the lines that contain "Running":

Command:

pcregrep -M "Running(\n|.)*?Tests run: 0" file.txt | \
pcregrep -Mv "Running(\n|.)*?Tests run: [^0]" | \
grep -i running

Output:

[mvn] Running com.mypackage.MyNotExecutedTest
[mvn] Running com.mypackage.AnotherNotExecutedTest

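4) Finally, use awk to print only the field I'm interested in (the process name, which in your example always seems to be the third "word" on the line):

Final command: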

pcregrep -M "Running(\n|.)*?Tests run: 0" file.txt | \
pcregrep -Mv "Running(\n|.)*?Tests run: [^0]" | \
grep -i running | \
awk '{print $3};'

Final output:

com.mypackage.MyNotExecutedTest
com.mypackage.AnotherNotExecutedTest

HTH!