Processing multi-line logs with AWK to collect SQL statements

Time: 2016-01-28 12:32:25

Tags: bash awk

I have the following entries in a log file:

2016-01-25 21:12:41 UTC:172.31.21.125(56665):user@production:[21439]:ERROR:  bind message supplies 1 parameters, but
prepared statement "" requires 0

2016-01-25 21:12:41 UTC:172.31.21.125(56665):user@production:[21439]:STATEMENT:  SELECT count(*) AS total FROM (
                SELECT 1 AS count
                  FROM leads_search_criteria_entities
                  INNER JOIN entities e on entity_id = e.viq_id
                  LEFT JOIN companies_user cu ON cu.entity_id = e.viq_id
                  WHERE criterium_id = 644 AND ((
                ( cu.udef_type IS NULL -- if not set by user, check calculated value
                  AND is_university >= 50
                ) OR (
                  cu.udef_type IS NOT NULL -- if set by user, use it
                  AND cu.udef_type = 'university'
                )
              ))
                  GROUP BY e.viq_id

                  ORDER BY e.viq_id
                ) x
2016-01-25 21:14:11 UTC::@:[2782]:LOG:  checkpoint starting: time
2016-01-25 21:14:16 UTC::@:[2782]:LOG:  checkpoint complete: wrote 51 buffers (0.0%); 0 transaction log file(s) added, 0 remov
ed, 0 recycled; write=5.046 s, sync=0.038 s, total=5.091 s; sync files=18, longest=0.008 s, average=0.002 s
2016-01-25 21:19:11 UTC::@:[2782]:LOG:  checkpoint starting: time

I want to capture the SQL statements, but I don't know how to do this with AWK.

Update

Expected result:

SELECT count(*) AS total FROM ( SELECT 1 AS count FROM leads_search_criteria_entities INNER JOIN entities e on entity_id = e.viq_id LEFT JOIN companies_user cu ON cu.entity_id = e.viq_id WHERE criterium_id = 644 AND (( ( cu.udef_type IS NULL -- if not set by user, check calculated value AND is_university >= 50 ) OR ( cu.udef_type IS NOT NULL -- if set by user, use it AND cu.udef_type = 'university' ) )) GROUP BY e.viq_id ORDER BY e.viq_id ) x

My current, almost-working solution uses sed, but this is where I'm stuck: it only filters the line containing the SELECT (which itself spans multiple lines) and the line after it. Any suggestions are appreciated.

3 answers:

Answer 0 (score: 1)

$ cat log.awk
f && /^[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]/ {f=0; print ""}
sub(/^.*:STATEMENT:[[:space:]]+/,"") {f=1}
f { $1=$1; printf "%s ", $0 }

$ awk -f log.awk log.txt
SELECT count(*) AS total FROM ( SELECT 1 AS count FROM leads_search_criteria_entities INNER JOIN entities e on entity_id = e.viq_id LEFT JOIN companies_user cu ON cu.entity_id = e.viq_id WHERE criterium_id = 644 AND (( ( cu.udef_type IS NULL -- if not set by user, check calculated value AND is_university >= 50 ) OR ( cu.udef_type IS NOT NULL -- if set by user, use it AND cu.udef_type = 'university' ) )) GROUP BY e.viq_id ORDER BY e.viq_id ) x

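(Line 2) This turns printing on (f=1) when :STATEMENT: is found. As a side effect, the sub() removes everything up to the start of the SELECT statement.

(Line 3) It then keeps printing until printing is turned off (see below), cleaning up by replacing each run of whitespace with a single space. (Edit: thanks to @ghoti for suggesting the elegant $1=$1.)

(Line 1) This turns printing off at the start of the next log entry, which is identified by its leading date. It also prints a courtesy newline to end the SELECT.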
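One edge case the script leaves open is a log that ends in the middle of a STATEMENT entry: no later date line arrives, so the closing newline is never printed. The following is a minimal sketch of one way to handle that, assuming an extra END rule is acceptable. The wrapper name extract_sql and the sample log entries are made up for illustration and are not part of the answer.

```shell
# extract_sql is a hypothetical wrapper name. The three main rules are the
# ones from log.awk; the END rule is an addition, not part of the answer.
extract_sql() {
  awk '
    # Next log entry begins: stop printing and terminate the current statement.
    f && /^[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]/ { f = 0; print "" }
    # STATEMENT entry: strip the log prefix and start printing.
    sub(/^.*:STATEMENT:[[:space:]]+/, "") { f = 1 }
    # While printing, collapse whitespace and join lines with a space.
    f { $1 = $1; printf "%s ", $0 }
    # Log ended inside a statement: still finish the line.
    END { if (f) print "" }
  ' "$@"
}

# Made-up sample whose last entry is a STATEMENT.
extract_sql <<'EOF'
2016-01-25 21:12:41 UTC:10.0.0.1(1):u@db:[1]:STATEMENT:  SELECT a
        FROM t
2016-01-25 21:14:11 UTC::@:[2]:LOG:  checkpoint starting: time
2016-01-25 21:15:00 UTC:10.0.0.1(1):u@db:[1]:STATEMENT:  SELECT b
        FROM u
EOF
```

Each statement comes out on its own line, including the last one. As in the original script, every output line keeps the trailing space written by printf.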

Answer 1 (score: 1)

Update

How about combining sed and tr?

sed 's/^[0-9][^S]*//' INPUT.txt | sed '/^[0-9a-z]/d' | tr -s ' ' | tr -d '\n'

Output:

STATEMENT: SELECT count(*) AS total FROM ( SELECT 1 AS count FROM leads_search_criteria_entities INNER JOIN entities e on entity_id = e.viq_id LEFT JOIN companies_user cu ON cu.entity_id = e.viq_id WHERE criterium_id = 644 AND (( ( cu.udef_type IS NULL -- if not set by user, check calculated value AND is_university >= 50 ) OR ( cu.udef_type IS NOT NULL -- if set by user, use it AND cu.udef_type = 'university' ) )) GROUP BY e.viq_id ORDER BY e.viq_id ) x
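The output above still carries the leading "STATEMENT: " label, which the expected result does not have. The following is a rough sketch of one way to drop it, assuming a second sed expression is acceptable. The function name flatten_statement is hypothetical, and the LC_ALL=C prefix is an addition that keeps [a-z] limited to ASCII lowercase. Like the original pipeline, it relies on the log prefix containing no capital S before STATEMENT and it runs all statements together on one line.

```shell
# flatten_statement is a hypothetical name; the pipeline is the one from this
# answer plus one extra sed expression that removes the STATEMENT: label.
flatten_statement() {
  LC_ALL=C sed -e 's/^[0-9][^S]*//' -e 's/^STATEMENT: *//' "$@" |
    LC_ALL=C sed '/^[0-9a-z]/d' |   # drop wrapped continuation lines of log messages
    tr -s ' ' |                     # squeeze runs of spaces
    tr -d '\n'                      # join everything into a single line
}

# Made-up sample input.
flatten_statement <<'EOF'
2016-01-25 21:12:41 UTC:10.0.0.1(1):u@db:[1]:ERROR:  bad thing
2016-01-25 21:12:41 UTC:10.0.0.1(1):u@db:[1]:STATEMENT:  SELECT a
        FROM t
2016-01-25 21:14:11 UTC::@:[2]:LOG:  checkpoint starting: time
EOF
```

Because tr -d '\n' removes every newline, the result has no trailing newline either; append one with echo if the output feeds a line-oriented tool.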

Answer 2 (score: 1)

I wouldn't recommend sed for this. The first awk solution that comes to mind might look like this:

/^2016/&&line~/:STATEMENT:/ {
  sub(/.*:STATEMENT:/,"",line)
  print line
}
/^2016/ {
  line=""
}
{
  $1=$1
  line=sprintf("%s %s",line,$0)
}
END {
  if (line~/:STATEMENT:/) {
    sub(/.*:STATEMENT:/,"",line)
    print line
  }
}

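Obviously you can shrink this. I wrote and ran it (for testing) as a one-liner.

The idea here is:

  • We append to a variable, resetting it whenever an input line starts with the year. (You could replace that with a regex that matches a date if you want to run it unmodified next year.)
  • When we reach a new log line (or the end of input), we strip the cruft before the SQL statement and print the result.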
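The two ideas above, together with the suggestion to match a date rather than a fixed year, could be sketched as follows. This is an illustrative variant rather than the answer's own code: the names extract_statements and emit are hypothetical, the date pattern is spelled out digit by digit so it does not depend on interval support in the awk at hand, and blank lines are skipped (an added assumption) so they do not introduce extra spaces.

```shell
# extract_statements and emit are hypothetical names used for illustration.
extract_statements() {
  awk '
    # Print the accumulated entry if it is a STATEMENT, minus its log prefix.
    function emit() {
      if (line ~ /:STATEMENT:/) {
        sub(/.*:STATEMENT: */, "", line)
        print line
      }
    }
    # A new log entry starts with a YYYY-MM-DD date instead of a fixed year.
    /^[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9] / { emit(); line = "" }
    # Collapse whitespace in non-blank lines and append them to the entry.
    NF { $1 = $1; line = (line == "" ? $0 : line " " $0) }
    # Flush the final entry at end of input.
    END { emit() }
  ' "$@"
}

# Made-up sample input.
extract_statements <<'EOF'
2016-01-25 21:12:41 UTC:10.0.0.1(1):u@db:[1]:STATEMENT:  SELECT a
        FROM t
2016-01-25 21:14:11 UTC::@:[2]:LOG:  checkpoint starting: time
2017-03-02 08:00:00 UTC:10.0.0.1(1):u@db:[1]:STATEMENT:  SELECT b
        FROM u
EOF
```

Moving the duplicated strip-and-print logic into one function means the mid-stream case and the END case cannot drift apart.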

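Note the $1=$1. Assigning to a field makes awk rebuild the line from its fields, joined by single spaces, so leading whitespace, tabs and runs of spaces collapse to one space each. Combined with the sprintf that joins the lines, this turns the multi-line statement into a single clean line. Try removing it to see the effect.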
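A quick way to see the effect in isolation, as a small sketch (the function name collapse_ws and the input string are made up for illustration):

```shell
# collapse_ws is a hypothetical helper: assigning $1 to itself forces awk to
# rebuild $0 from its fields, joined by OFS (a single space by default).
collapse_ws() { awk '{ $1 = $1; print }'; }

# Leading spaces, a tab and repeated spaces in the input...
printf '   SELECT   1\tAS  count  \n' | collapse_ws      # prints: SELECT 1 AS count

# ...whereas a plain print leaves the line exactly as it was read.
printf '   SELECT   1\tAS  count  \n' | awk '{ print }'
```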