How can I reformat mysql.log into a simple format using awk or sed?
I have a chunk of mysql.log:
131024 13:17:40 1 Query select * from test_numbers
1 Query select
n.id,
n.name,
n.value
from test_numbers AS n
limit 0, 50
1 Query select
count(*)
FROM test_numbers
1 Query SHOW STATUS
131024 13:17:50 1 Query SHOW STATUS
131024 13:18:00 1 Query select * from test_numbers
1 Query select
n.id,
n.name,
n.value
from test_numbers AS n
limit 0, 50
1 Query select
count(*)
FROM test_numbers
1 Query SHOW STATUS
The "Query" entries span multiple lines. I want to reformat them like this:
131024 13:17:40 1 Query select * from test_numbers
131024 13:17:40 1 Query select n.id, n.name, n.value from test_numbers AS n limit 0, 50
131024 13:17:40 1 Query select count(*) FROM test_numbers
131024 13:17:40 1 Query SHOW STATUS
131024 13:17:50 1 Query SHOW STATUS
131024 13:18:00 1 Query select * from test_numbers
131024 13:18:00 1 Query select n.id, n.name, n.value from test_numbers AS n limit 0, 50
131024 13:18:00 1 Query select count(*) FROM test_numbers
131024 13:18:00 1 Query SHOW STATUS
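That is, each multi-line "Query" entry is joined onto a single line, and entries without a timestamp take the most recent one.
I tried a few scripts, but they failed: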
sed -r ':a;N;$!ba;s/\n(^(([0-9]+\t[0-9:]+)|(\t\t[0-9:]+)))/ \1/g' mysql.log
awk '/^(([0-9]+ [0-9:]+)|(\t\t[0-9:]+))/{print "";next}{printf $0}END{print "";}' mysql.log
Thanks! I have adapted some of the contributions; please see my posted answer.
Answer 0 (score: 3)
This awk should do what you want:
awk '{$1=$1} /Query/ && s {print s;s=""} /^[0-9][0-9]/ {s=$0;f=$1 " " $2} /^[0-9]+ Query/ {s=f " "$0} !/Query/ {s=s " " $0} END {print s}' t
131024 13:17:40 1 Query select * from test_numbers
131024 13:17:40 1 Query select n.id, n.name, n.value from test_numbers AS n limit 0, 50
131024 13:17:40 1 Query select count(*) FROM test_numbers
131024 13:17:40 1 Query SHOW STATUS
131024 13:17:50 1 Query SHOW STATUS
131024 13:18:00 1 Query select * from test_numbers
131024 13:18:00 1 Query select n.id, n.name, n.value from test_numbers AS n limit 0, 50
131024 13:18:00 1 Query select count(*) FROM test_numbers
131024 13:18:00 1 Query SHOW STATUS
How does it work?
awk '
{
    $1=$1                  # rebuild the record so every run of whitespace becomes one space
}
/Query/ && s {             # if the line contains "Query" and s is non-empty (avoids printing an empty first line)
    print s                # print the record collected so far
    s=""                   # reset s
}
/^[0-9][0-9]/ {            # if the line starts with two digits, i.e. a date (a thread id of 10 or more would also match)
    s=$0                   # save the line in s
    f=$1 " " $2            # save the date and time in f
}
/^[0-9]+ Query/ {          # if the line starts with "number Query" (no timestamp)
    s=f " "$0              # set s to the saved date and time plus the line
}
!/Query/ {                 # if the line does not contain "Query"
    s=s " " $0             # append the line to s
}
END {
    print s                # print the last record
}
' file
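The `$1=$1` step is what makes the later patterns simple, so it may help to see it in isolation. A minimal sketch, assuming the default field separator; `normalize_ws` is a hypothetical helper name, not part of the answer:

```shell
# normalize_ws: hypothetical helper. Assigning to a field makes awk rebuild
# the record from its fields joined by OFS (a single space by default), so
# runs of tabs and spaces collapse and leading/trailing blanks disappear.
normalize_ws() {
    awk '{ $1 = $1; print }'
}

# A continuation line as it might appear in the log (tabs plus spaces).
printf '\t\t    1 Query\tselect\n' | normalize_ws
# prints: 1 Query select
```

After this step a continuation line always starts with the thread id, which is why `/^[0-9]+ Query/` can be anchored at the start of the line.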
Answer 1 (score: 2)
Try the following script (commented).
Contents of script.awk:
## A line with the timestamp. Get it and remove from the line for further processing.
$1 ~ /^[[:digit:]]+$/ && $2 ~ /^[[:digit:]]{2}:[[:digit:]]{2}:[[:digit:]]{2}$/ {
    timestamp = $1 " " $2
    $1 = $2 = ""
}
## A line that points to the beginning of a query. Print the previous one
## saved in "q" variable, together with the timestamp that was current when
## it started ("qts"), and begin to save the current one.
$0 ~ /\<[[:digit:]]+[[:blank:]]Query\>/ {
    if ( q ) {
        printf "%s\t%s\n", qts, q
        q = ""
    }
    sub(/^[[:blank:]]+/, "")
    q = $0
    qts = timestamp
    next
}
## A line that is a continuation of a query. Save its content removing leading
## and trailing spaces.
{
    sub(/^[[:blank:]]+/, "")
    sub(/[[:blank:]]+$/, "")
    q = q " " $0
}
## Don't forget the last query when file ends.
END {
    if ( q ) {
        printf "%s\t%s\n", qts, q
        q = ""
    }
}
Run it like this (GNU awk is needed for the \< and \> word-boundary operators):
awk -f script.awk infile
Output:
131024 13:17:40 1 Query select * from test_numbers
131024 13:17:40 1 Query select n.id, n.name, n.value from test_numbers AS n limit 0, 50
131024 13:17:40 1 Query select count(*) FROM test_numbers
131024 13:17:40 1 Query SHOW STATUS
131024 13:17:50 1 Query SHOW STATUS
131024 13:18:00 1 Query select * from test_numbers
131024 13:18:00 1 Query select n.id, n.name, n.value from test_numbers AS n limit 0, 50
131024 13:18:00 1 Query select count(*) FROM test_numbers
131024 13:18:00 1 Query SHOW STATUS
Answer 2 (score: 2)
$ cat tst.awk
/[[:digit:]]+ +Query/ {
    if (rec) print rec
    if ( match($0,/^[[:digit:]]+ [[:digit:]:]+ +/) ) {
        ts = substr($0,RSTART,RLENGTH)
    }
    else {
        sub(/^ +/,ts)
    }
    rec = $0
    next
}
{ sub(/^ +/,""); rec = rec OFS $0 }
END { if (rec) print rec }
$ awk -f tst.awk file
131024 13:17:40 1 Query select * from test_numbers
131024 13:17:40 1 Query select n.id, n.name, n.value from test_numbers AS n limit 0, 50
131024 13:17:40 1 Query select count(*) FROM test_numbers
131024 13:17:40 1 Query SHOW STATUS
131024 13:17:50 1 Query SHOW STATUS
131024 13:18:00 1 Query select * from test_numbers
131024 13:18:00 1 Query select n.id, n.name, n.value from test_numbers AS n limit 0, 50
131024 13:18:00 1 Query select count(*) FROM test_numbers
131024 13:18:00 1 Query SHOW STATUS
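The script carries the timestamp forward by capturing it with `match()` and later substituting it for the leading blanks of lines that have none. A small sketch of just the capture step, assuming space-separated input as in the script; `extract_ts` is a hypothetical name, and `[0-9]` is used here instead of `[[:digit:]]` only so the sketch also runs on awks without POSIX character classes:

```shell
# extract_ts: hypothetical helper. match() sets RSTART and RLENGTH to the
# position and length of the matched text; substr() then cuts out the
# timestamp together with the blanks that follow it.
extract_ts() {
    awk 'match($0, /^[0-9]+ [0-9:]+ +/) {
        print "[" substr($0, RSTART, RLENGTH) "]"   # brackets make the blanks visible
    }'
}

printf '131024 13:17:40     1 Query SHOW STATUS\n' | extract_ts
# prints: [131024 13:17:40     ]
```

Because the captured prefix keeps its trailing blanks, replacing the indentation of a later line with it puts the thread id back in the same column. A line without a timestamp does not match and prints nothing here.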
Answer 3 (score: 1)
This might work for you (GNU sed):
sed -r ':a;$!N;/^((\S{6} ..:..:..\s*)(\S+ \S+).*\n)\s*\3/{s//\1\2\3/;P;D};/^\S{6} ..:..:...*\n\S{6} ..:..:../{P;D};s/\n\s*/ /;ta' file
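The command is terse, so here is a reduced sketch of the loop it is built around: append the next line with `N`, try to merge it, and branch back with `t` for as long as a merge succeeds. It assumes GNU sed and that continuation lines start with blanks; `join_continuations` is a hypothetical name, and unlike the command above this sketch does not copy the timestamp onto lines that lack one.

```shell
# join_continuations: hypothetical helper (GNU sed assumed).
#   :a            label for the loop
#   $!N           append the next input line unless this is the last one
#   s/\n.../ /    if the appended line starts with blanks, join it
#   ta            if the join happened, loop and try the following line
#   P;D           otherwise print the first line and restart with the rest
join_continuations() {
    sed -E ':a;$!N;s/\n[[:blank:]]+/ /;ta;P;D'
}

printf 'select\n  n.id,\n  n.name\nSHOW STATUS\n' | join_continuations
# prints:
# select n.id, n.name
# SHOW STATUS
```

The full command adds two tests before the join: one that prefixes the saved timestamp when the next line starts a new entry, and one that simply prints when two timestamped lines follow each other.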
Answer 4 (score: 0)
#
# http://stackoverflow.com/questions/19571592/reformat-mysql-log-to-simple-format-using-awk-or-sed/19578281?noredirect=1#19578281
# modified from the original proposed by Ed Morton
#
# Mysql logs some commands: Connect, Quit, Query, Init DB
/[[:digit:]]+ +(Quit|Connect|Query|Init DB)/ {
    if (rec) {
        print rec
    }
    $0 = cleanTabs($0)
    # put tabs after the id and after the command in "transaction_id command"
    $0 = gensub(/([[:digit:]]+) +(Quit|Connect|Query|Init DB)/, "\\1\t\\2\t", "g")
    if ( match($0, /^([[:digit:]]+ +[[:digit:]:]+ )/) ) {
        # the line starts with "YYMMDD HH:MM:SS"; store it in ts as "YYYY-MM-DD HH:MM:SS"
        # (substr positions are 1-based)
        ts = "20" substr($0, 1, 2) "-" substr($0, 3, 2) "-" substr($0, 5, 2) substr($0, 7, 9)
        #sub(/^\t+/,ts OFS)
        sub(/^([[:digit:]]+ +[[:digit:]:]+ )/, ts "\t")
    }
    else {
        sub(/^ +/, ts "\t")
    }
    rec = $0
    next
}
{ sub(/^ +/,""); rec = rec OFS $0 }
END { if (rec) print rec }
# convert tabs to spaces in "line"
function cleanTabs(line) {
    gsub(/\t/, " ", line)
    return line
}
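The date rewrite depends on `substr()`, whose positions are 1-based in awk. A minimal sketch of only that step, assuming the first field is a YYMMDD date from the 2000s (the "20" prefix is hard-coded, as in the script); `reformat_date` is a hypothetical name:

```shell
# reformat_date: hypothetical helper. Turns a YYMMDD first field into
# YYYY-MM-DD by slicing it with substr(); position 1 is the first character.
reformat_date() {
    awk '{ print "20" substr($1, 1, 2) "-" substr($1, 3, 2) "-" substr($1, 5, 2) }'
}

printf '131024 13:17:40\n' | reformat_date
# prints: 2013-10-24
```

Starting a slice at position 0 is not portable: awk implementations disagree on how many characters it returns, so starting at 1 is the safer choice.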