Print the first occurrence of each unique regex match, with its line number

Time: 2016-09-15 11:08:16

Tags: bash shell sh

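Given a regex, I want to use bash to print the first occurrence of each unique match along with its line number.

For example, if the regex is .*Exception, I would like to print: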

$ ./script.sh file.log
6255:2016-09-07 10:05:37,886 ERROR some text java.lang.IllegalMonitorStateException
6714:2016-09-07 10:12:09,514 ERROR some text java.lang.NullPointerException
7013:2016-09-07 10:19:19,950 ERROR some text java.lang.IllegalStateException

I came up with a version, but it is very slow :( (on git-bash). Any pointers on how to improve the performance are appreciated.

FILE_NAME=$1

while read line
do
    grep "$line" "$FILE_NAME" -m1 -n
done < <(grep '\b[^ ]*Exception\b' "$FILE_NAME" | sort -u) | sort -n

Update (sample data added):

2016-09-07 23:58:55,674 ERROR [STDERR] (pool-18-thread-1) Continuing ...
2016-09-07 23:58:55,675 ERROR [STDERR] (pool-18-thread-1) java.lang.InstantiationException: java.sql.Timestamp
2016-09-07 23:58:55,675 ERROR [STDERR] (pool-18-thread-1) Continuing ...
2016-09-07 23:56:16,273 WARN  [com.arjuna.ats.jta.logging.loggerI18N] (Thread-12) [com.arjuna.ats.internal.jta.recovery.xarecovery1] Local XARecoveryModule.xaRecovery  got XA exception javax.transaction.xa.XAException, XAException.XAER_RMERR
2016-09-07 23:58:55,675 ERROR [STDERR] (pool-18-thread-1) java.lang.RuntimeException: failed to evaluate: <unbound>=Class.new();
2016-09-07 23:58:55,675 ERROR [STDERR] (pool-18-thread-1) Continuing ...
2016-09-07 23:58:26,304 WARN  [com.arjuna.ats.jta.logging.loggerI18N] (Thread-12) [com.arjuna.ats.internal.jta.recovery.xarecovery1] Local XARecoveryModule.xaRecovery  got XA exception javax.transaction.xa.XAException, XAException.XAER_RMERR

The above should produce:

2:2016-09-07 23:58:55,675 ERROR [STDERR] (pool-18-thread-1) java.lang.InstantiationException: java.sql.Timestamp
4:2016-09-07 23:56:16,273 WARN  [com.arjuna.ats.jta.logging.loggerI18N] (Thread-12) [com.arjuna.ats.internal.jta.recovery.xarecovery1] Local XARecoveryModule.xaRecovery  got XA exception javax.transaction.xa.XAException, XAException.XAER_RMERR
5:2016-09-07 23:58:55,675 ERROR [STDERR] (pool-18-thread-1) java.lang.RuntimeException: failed to evaluate: <unbound>=Class.new();

2 answers:

Answer 0: (score: 1)

$ cat ip.txt 
2016-09-07 23:58:55,674 ERROR [STDERR] (pool-18-thread-1) Continuing ...
2016-09-07 23:58:55,675 ERROR [STDERR] (pool-18-thread-1) java.lang.InstantiationException: java.sql.Timestamp
2016-09-07 23:58:55,675 ERROR [STDERR] (pool-18-thread-1) Continuing ...
2016-09-07 23:56:16,273 WARN  [com.arjuna.ats.jta.logging.loggerI18N] (Thread-12) [com.arjuna.ats.internal.jta.recovery.xarecovery1] Local XARecoveryModule.xaRecovery  got XA exception javax.transaction.xa.XAException, XAException.XAER_RMERR
2016-09-07 23:58:55,675 ERROR [STDERR] (pool-18-thread-1) java.lang.RuntimeException: failed to evaluate: <unbound>=Class.new();
2016-09-07 23:58:55,675 ERROR [STDERR] (pool-18-thread-1) Continuing ...
2016-09-07 23:58:26,304 WARN  [com.arjuna.ats.jta.logging.loggerI18N] (Thread-12) [com.arjuna.ats.internal.jta.recovery.xarecovery1] Local XARecoveryModule.xaRecovery  got XA exception javax.transaction.xa.XAException, XAException.XAER_RMERR

$ perl -ne '($e)=/(\w+Exception)/; print "$.:$_" if !$seen{$e}++ && /Exception/' ip.txt
2:2016-09-07 23:58:55,675 ERROR [STDERR] (pool-18-thread-1) java.lang.InstantiationException: java.sql.Timestamp
4:2016-09-07 23:56:16,273 WARN  [com.arjuna.ats.jta.logging.loggerI18N] (Thread-12) [com.arjuna.ats.internal.jta.recovery.xarecovery1] Local XARecoveryModule.xaRecovery  got XA exception javax.transaction.xa.XAException, XAException.XAER_RMERR
5:2016-09-07 23:58:55,675 ERROR [STDERR] (pool-18-thread-1) java.lang.RuntimeException: failed to evaluate: <unbound>=Class.new();
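  • ($e)=/(\w+Exception)/ saves the exception type in the $e variable
  • !$seen{$e}++ ensures that only the first line matching a given exception type is printed
  • && /Exception/ prints only lines that contain Exception
  • print "$.:$_" prints the line number, a :, and the input line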


Edit

This should also work, and faster...

perl -ne 'if(/(\w+Exception)/){print "$.:$_" if !$seen{$1}++}' ip.txt
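
If you would rather stay in bash itself, as the question asks, the same "seen" idea can be sketched with an associative array and a single pass over the file. This is only a sketch under some assumptions: it needs bash 4 or later, the function name first_unique_matches is made up for illustration, and the pattern [[:alnum:]_]+Exception is my approximation of \w+Exception. A pure-bash read loop will probably still be slower than perl or awk on a large log, but it avoids starting one grep per match, which is likely what makes the original script slow on git-bash.

```bash
#!/usr/bin/env bash
# first_unique_matches FILE  (hypothetical helper name, not from the answer)
# Prints "LINENO:LINE" for the first line containing each distinct
# exception class name. Requires bash 4+ for associative arrays.
first_unique_matches() {
    local file=$1 line key n=0
    local -A seen=()                    # exception names already printed
    local re='[[:alnum:]_]+Exception'   # assumed stand-in for \w+Exception

    # "|| [[ -n $line ]]" also handles a last line without a trailing newline
    while IFS= read -r line || [[ -n $line ]]; do
        n=$((n + 1))
        if [[ $line =~ $re ]]; then
            key=${BASH_REMATCH[0]}      # e.g. "XAException"
            if [[ -z ${seen[$key]:-} ]]; then
                seen[$key]=1
                printf '%s:%s\n' "$n" "$line"
            fi
        fi
    done < "$file"
}
```

Call it as first_unique_matches file.log. Since the file is read in order, the output is already sorted by line number and no final sort -n is needed.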

Answer 1: (score: 1)

In GNU awk:

$ awk '/Exception/ && !seen[gensub(/^([^ ]* ){2}/,"","g")]++ {print NR,$0}' file.log
2 2016-09-07 23:58:55,675 ERROR [STDERR] (pool-18-thread-1) java.lang.InstantiationException: java.sql.Timestamp
4 2016-09-07 23:56:16,273 WARN  [com.arjuna.ats.jta.logging.loggerI18N] (Thread-12) [com.arjuna.ats.internal.jta.recovery.xarecovery1] Local XARecoveryModule.xaRecovery  got XA exception javax.transaction.xa.XAException, XAException.XAER_RMERR
5 2016-09-07 23:58:55,675 ERROR [STDERR] (pool-18-thread-1) java.lang.RuntimeException: failed to evaluate: <unbound>=Class.new();

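Print the record if:

  • /Exception/ matches
  • &&
  • !seen[...]++ the key has not been seen before
  • the key is created by gensub(/^([^ ]* ){2}/,"","g"), which removes everything from the start (^) up to the second space
  • print NR,$0 prints the current record number and the record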
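
Note that the key above is the whole line minus the timestamp, so two lines with the same exception but different trailing text count as different. A variant keyed on the exception class name alone might look like the sketch below. It uses only match(), RSTART and RLENGTH, which are in POSIX awk, so it should not need GNU awk. The wrapper name first_by_exception and the character class [A-Za-z0-9_]+ (my stand-in for \w+) are assumptions for illustration.

```sh
# first_by_exception FILE  (hypothetical helper name)
# Prints "NR:record" for the first record containing each distinct
# exception class name.
first_by_exception() {
    awk '
        # match() sets RSTART/RLENGTH to the leftmost-longest match
        match($0, /[A-Za-z0-9_]+Exception/) {
            key = substr($0, RSTART, RLENGTH)   # e.g. "XAException"
            if (!seen[key]++) print NR ":" $0
        }
    ' "$1"
}
```

It prints NR and the record joined by a colon, matching the format requested in the question, rather than the space that print NR,$0 produces.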