省略字符串时并行递归Grep

时间:2019-07-10 12:57:17

标签: linux unix recursion parallel-processing grep

问题

我无法执行递归并行zgrep,同时还从结果中省略了一个字符串。

我正在查看大约640GB的压缩防火墙日志,并且几乎%30的行中包含字符串“ Duplicate SYN”(我想省略)

递归并行Zgrep(不省略字符串)-成功

我能够像这样成功地执行并行递归grep

find /var/logs/syslog -name \* -print0 | xargs -0 -n 1 -P 36 zgrep -f foo.txt > /tmp/bar.txt

foo.txt的内容:

10\.10\.0\.28
10\.10\.3\.41
10\.10\.0\.46
10\.10\.5\.47
10\.11\.0\.48
10\.10\.0\.49
10\.144\.41\.145
10\.122\.41\.241

示例输出

Apr 18 01:39:33 ASAFW01 : %ASA-4-419002: Duplicate TCP SYN from inside:10.10.0.28/61763 to inside:10.122.41.241/8443 with different initial sequence number
Apr 18 01:39:33 ASAFW01 : %ASA-4-419002: Duplicate TCP SYN from inside:10.10.0.28/61763 to inside:10.122.41.241/8443 with different initial sequence number
May 31 02:58:46 ASAFW01 : %ASA-6-302014: Teardown TCP connection 461681145 for DMZ_EXP_INSIDE:10.122.41.241/7400 to inside:10.5.91.50/30378 duration 0:00:00 bytes 0 Failover primary closed
May 31 02:58:47 ASAFW01 : %ASA-6-302013: Built outbound TCP connection 1962428108 for DMZ_EXP_INSIDE:10.122.41.241/7400 (10.122.41.241/7400) to inside:10.11.0.48/33990 (10.11.0.48/33990)
May 31 02:58:47 ASAFW01 : %ASA-6-302014: Teardown TCP connection 1962428108 for DMZ_EXP_INSIDE:10.122.41.241/7400 to inside:10.11.0.48/33990 duration 0:00:00 bytes 3188 TCP Reset-O from DMZ_EXP_INSIDE
May 31 02:58:49 ASAFW01 : %ASA-6-106015: Deny TCP (no connection) from 10.11.0.48/35976 to 10.122.41.241/7400 flags RST  on interface inside
May 31 02:58:49 ASAFW01 : %ASA-6-106015: Deny TCP (no connection) from 10.11.0.48/35976 to 10.122.41.241/7400 flags RST  on interface inside

我要忽略的输出

Apr 18 01:39:33 ASAFW01 : %ASA-4-419002: Duplicate TCP SYN from inside:10.10.0.28/61763 to inside:10.122.41.241/8443 with different initial sequence number
Apr 18 01:39:33 ASAFW01 : %ASA-4-419002: Duplicate TCP SYN from inside:10.10.0.28/61763 to inside:10.122.41.241/8443 with different initial sequence number

想法

  • 修改foo.txt以使用省略了“ Duplicate”一词的正则表达式(不确定如何执行此操作)
  • 以递归方式删除所有640GB压缩日志文件中所有包含“重复”一词的行(使用sed?)

递归并行Zgrep(省略字符串)-失败

但是,当我尝试从结果中排除某些内容时,会出现错误。

导致错误的命令:

find /var/logs/syslog -name \* -print0 | xargs -0 -n 1 -P 36 zgrep -f foo.txt -v Duplicate > /tmp/bar.txt

gzip: Duplicate.gz: No such file or directory
gzip: Duplicate.gz: No such file or directory
gzip: Duplicate.gz: No such file or directory
gzip: Duplicate.gz: No such file or directory
gzip: Duplicate.gz: No such file or directory
gzip: Duplicate.gz: No such file or directory

0 个答案:

没有答案