Filtering the required words from a log file

Time: 2018-03-21 04:59:36

Tags: shell awk sed grep

I have a log like this:

cat log.log | grep 'count\|dagName'

2018-03-20T15:53:24,001 INFO  [HiveServer2-Background-Pool: Thread-70([])]: exec.Task (TezTask.java:build(355)) - Dag name: select count(*) from reportingperiod(Stage-1)
2018-03-20T15:53:24,369 INFO  [HiveServer2-Background-Pool: Thread-70([])]: client.TezClient (TezClient.java:submitDAGSession(522)) - Submitting dag to TezSession, sessionName=HIVE-8216b875-c18e-4fcb-b25c-7fd6cb8efe10, applicationId=application_1521559442968_0003, dagName=select count(*) from repo(Stage-1), callerContext={ context=HIVE, callerType=HIVE_QUERY_ID, callerId=hive_20180320155311_aae27431-a30d-4022-950c-c5ddb340098c }

I want to extract a value from the log above and store it in a variable, for example:

a=select count(*) from repo

I tried this command:

a=$(awk 'BEGIN{ print "" }
 /dagName\(=/{ sub(/.*count=[^[:space:]]+: /,""); q=$0 }
 /dagName:\/\//{ print "," q }' OFS=',' log.log)

But the command above prints only ','. Any help would be appreciated.

1 Answer:

Answer 0: (score: 1)

You can use grep:

grep -oE 'select count\(.\) from [a-zA-Z][a-zA-Z0-9]*' log.log

To assign the result to a variable:

result="$(grep -oE 'select count\(.\) from [a-zA-Z][a-zA-Z0-9]*' log.log)"
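
Note that the grep pattern matches every "select count(*) from <name>" in the file, so with the log shown in the question the variable would likely hold two lines (one from the "Dag name:" record and one from the "dagName=" record). If only the dagName= value is wanted, one possible alternative is a sed substitution anchored on that key. The following is a minimal sketch, not part of the original answer. It assumes each log record sits on a single physical line and that the value always ends with a "(Stage-N)" suffix. The sample records are abridged from the question and written to a temporary file only to keep the sketch self-contained.

```shell
#!/bin/sh
# Abridged sample records from the question (assumption: one record per line).
log=$(mktemp)
cat > "$log" <<'EOF'
2018-03-20T15:53:24,001 INFO  exec.Task (TezTask.java:build(355)) - Dag name: select count(*) from reportingperiod(Stage-1)
2018-03-20T15:53:24,369 INFO  client.TezClient (TezClient.java:submitDAGSession(522)) - Submitting dag to TezSession, applicationId=application_1521559442968_0003, dagName=select count(*) from repo(Stage-1), callerContext={ context=HIVE, callerType=HIVE_QUERY_ID }
EOF

# Capture the text between "dagName=" and the "(Stage-N)" suffix.
# In basic regex syntax \( \) is a capture group and a bare ( ) is literal.
# -n together with the p flag prints only lines where the substitution matched.
a=$(sed -n 's/.*dagName=\(.*\)(Stage-[0-9]*).*/\1/p' "$log")

printf '%s\n' "$a"   # prints: select count(*) from repo
rm -f "$log"
```

Against a real file, replace "$log" with log.log and drop the mktemp, here-document, and rm lines. If the file contains several dagName= records, the variable will contain one value per line.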