I have a log file like this:
December 20, 2015, 11:00pm
November 18, 2014, 12:00am
October 05, 2012, 11:30pm
October 02, 2012, 5:30pm
October 01, 2012, 12:30am
October 01, 2010, 11:30am
October 01, 2011, 9:30pm
October 01, 2011, 7:30am
...
I can use sort on a simple date format like this:
Mar 4 07:45
Mar 8 06:45
Mar 8 05:45
sort -k1M -k2 -k3 text.txt
Mar 4 07:45
Mar 8 05:45
Mar 8 06:45
But I can't use sort directly on my log file. What should I do? How can I sort it with sort, awk, or something else?
Answer 0 (score: 3)
You can use Bash tools to convert each date to a timestamp, prepend it to the line, sort, and then remove it again:
while IFS=, read -r day year hour; do
    # year and hour keep their leading space, so the format adds none after the commas
    printf "%s %s,%s,%s\n" "$(date -d"$day $year $hour" +"%s")" "$day" "$year" "$hour"
done < file | sort -n | cut -d' ' -f2-
This assumes every line is in the format day, year, hour.
Let's convert the dates to timestamps:
while IFS=, read -r day year hour;
do
    printf "%s %s,%s,%s\n" "$(date -d"$day $year $hour" +"%s")" "$day" "$year" "$hour"
done < a
1450648800 December 20, 2015, 11:00pm
1416265200 November 18, 2014, 12:00am
1349472600 October 05, 2012, 11:30pm
1349191800 October 02, 2012, 5:30pm
1349044200 October 01, 2012, 12:30am
1285925400 October 01, 2010, 11:30am
1317497400 October 01, 2011, 9:30pm
Let's sort:
while IFS=, read -r day year hour;
do
    printf "%s %s,%s,%s\n" "$(date -d"$day $year $hour" +"%s")" "$day" "$year" "$hour"
done < a | sort -n
1285925400 October 01, 2010, 11:30am
1317497400 October 01, 2011, 9:30pm
1349044200 October 01, 2012, 12:30am
1349191800 October 02, 2012, 5:30pm
1349472600 October 05, 2012, 11:30pm
1416265200 November 18, 2014, 12:00am
1450648800 December 20, 2015, 11:00pm
Let's remove the temporary timestamp:
$ while IFS=, read -r day year hour;
do
    printf "%s %s,%s,%s\n" "$(date -d"$day $year $hour" +"%s")" "$day" "$year" "$hour"
done < a | sort -n | cut -d' ' -f2-
October 01, 2010, 11:30am
October 01, 2011, 9:30pm
October 01, 2012, 12:30am
October 02, 2012, 5:30pm
October 05, 2012, 11:30pm
November 18, 2014, 12:00am
December 20, 2015, 11:00pm
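If date cannot parse a line, the loop above produces an empty timestamp and that line sorts to the top. A minimal sketch of a more defensive variant, assuming GNU date (for -d); the function name sort_log_by_date is hypothetical, and TZ=UTC is an added assumption so that daylight-saving shifts cannot reorder lines:

```shell
# Hypothetical helper: reads log lines on stdin, writes them sorted to stdout.
sort_log_by_date() {
    while IFS= read -r line; do
        # Strip the commas so GNU date sees e.g. "October 01 2010 11:30am".
        if ts=$(TZ=UTC date -d "$(printf '%s' "$line" | tr -d ',')" +%s 2>/dev/null); then
            # Decorate: timestamp, a tab, then the untouched original line.
            printf '%s\t%s\n' "$ts" "$line"
        else
            # Report lines that date rejects instead of sorting them blindly.
            printf 'skipping unparsable line: %s\n' "$line" >&2
        fi
    done | sort -n | cut -f2-
}
```

Usage would be sort_log_by_date < file. Because the whole line is read into one variable and a tab is used as the separator, the original spacing of each line is preserved.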
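Answer 1 (score: 3)
Just use awk to build a YYYYMMDDHHMM string from each input line and prepend it to the line, then pipe the result to sort, and finally use cut to remove the string that awk added: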
$ cat tst.awk
BEGIN { FS="(,? +|:)" }   # split on spaces (with an optional leading comma) and on ":"
{
    mthAbbr = substr($1,1,3)
    mthNr = (match("JanFebMarAprMayJunJulAugSepOctNovDec",mthAbbr)+2)/3
    ampm = $NF; sub(/.*[0-9]/,"",ampm)
    hour = ($4 % 12) + ( ampm=="pm" ? 12 : 0 )   # 12am -> 00, 12pm -> 12
    printf "%04d%02d%02d%02d%02d\t%s\n", $3, mthNr, $2, hour, $5, $0
}
$ awk -f tst.awk file | sort | cut -f2-
October 01, 2010, 11:30am
October 01, 2011, 7:30am
October 01, 2011, 9:30pm
October 01, 2012, 12:30am
October 02, 2012, 5:30pm
October 05, 2012, 11:30pm
November 18, 2014, 12:00am
December 20, 2015, 11:00pm
To help you see what is happening, here are the intermediate steps:
$ awk -f tst.awk file
201512202300 December 20, 2015, 11:00pm
201411180000 November 18, 2014, 12:00am
201210052330 October 05, 2012, 11:30pm
201210021730 October 02, 2012, 5:30pm
201210010030 October 01, 2012, 12:30am
201010011130 October 01, 2010, 11:30am
201110012130 October 01, 2011, 9:30pm
201110010730 October 01, 2011, 7:30am
$ awk -f tst.awk file | sort
201010011130 October 01, 2010, 11:30am
201110010730 October 01, 2011, 7:30am
201110012130 October 01, 2011, 9:30pm
201210010030 October 01, 2012, 12:30am
201210021730 October 02, 2012, 5:30pm
201210052330 October 05, 2012, 11:30pm
201411180000 November 18, 2014, 12:00am
201512202300 December 20, 2015, 11:00pm
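Answer 2 (score: 2)
I remember posting an answer to a similar question, but I couldn't find it when I searched.
So the idea is to compute the number of seconds since 1970-01-01, prepend it to the original line, sort, and finally remove the prefix field.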
awk -v cmd='date -d"%s" +%s' \
    '{o=$0;gsub(/,/,"");cc=sprintf(cmd,$0,"%s");
      cc|getline d
      close(cc);print d"\x99"o}' file|sort -n|sed 's/.*\x99//'
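\x99 is a non-printable character, used only to make sure the separator cannot collide with any character already in the file.
With the sample input, this prints the lines in chronological order.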
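Starting one date process per line gets slow on large logs. A sketch of a batch variant, assuming GNU date, whose -f option reads one date string per input line. The name sort_log_batch is hypothetical, TZ=UTC is an added assumption for reproducible timestamps, and the input has to be a regular file because it is read twice:

```shell
# Hypothetical helper: sort_log_batch FILE
sort_log_batch() {
    # tr strips the commas, a single date process converts every line to
    # epoch seconds, and paste glues each timestamp (plus a tab) back onto
    # its original line.
    # Caveat: date prints nothing on stdout for a line it cannot parse, so
    # an unparsable line would shift the timestamps below it by one.
    tr -d ',' < "$1" | TZ=UTC date -f - +%s | paste - "$1" | sort -n | cut -f2-
}
```

Called as sort_log_batch file, this does the same decorate, sort, undecorate steps with a tab as the separator.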
Answer 3 (score: 2)
Another, similar approach, using Perl:
perl -MTime::Piece -lpe '$_ = Time::Piece->strptime($_, "%B %d, %Y, %l:%M%p")->strftime("%s") . "\t" . $_' file |
sort -n |
cut -f2-
Answer 4 (score: 1)
You can still sort field by field by splitting up the compound fields:
$ sed 's/[ap]m/ &/;s/:/ : /' log \
| sort -k3,3 -k1,1M -k2,2 -k7 -k4,4n -k6,6 \
| sed -r 's/ : /:/;s/ ([ap]m)/\1/'
October 01, 2010, 11:30am
October 01, 2011, 7:30am
October 01, 2011, 9:30pm
October 01, 2012, 12:30am
October 02, 2012, 5:30pm
October 05, 2012, 11:30pm
November 18, 2014, 12:00am
December 20, 2015, 11:00pm
Update: since the Romans had no zero, we get 12 < 1 < 2 < ... within each meridiem (am/pm). The fix is to replace 12 with 00 before sorting and change it back afterwards.
$ sed 's/[ap]m/ &/;s/12:/00:/;s/:/ : /' log \
| sort -k3,3 -k1,1M -k2,2 -k7 -k4,4n -k6 \
| sed -r 's/ : /:/;s/ ([ap]m)/\1/;s/00:/12:/'
October 01, 2010, 11:30am
October 01, 2011, 7:30am
October 01, 2011, 9:30pm
October 01, 2012, 12:30am
October 02, 2012, 5:30pm
October 05, 2012, 11:30pm
November 18, 2014, 12:00am
November 18, 2015, 12:00am
November 18, 2015, 1:00am
November 18, 2015, 12:00pm
November 18, 2015, 1:00pm
December 20, 2015, 11:00pm
P.S. At this point I would question the choice of log format.
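For reference, a sketch of the updated pipeline wrapped in a function, with each stage annotated. The name sort_log_fields is hypothetical; LC_ALL=C is an added assumption so that -M recognizes the English month names, and the last sed uses a basic regular expression so that GNU sed's -r is not required:

```shell
# Hypothetical helper: reads log lines on stdin, writes them sorted to stdout.
sort_log_fields() {
    # Split "12:30am" into "00 : 30 am", so that hour, minute and am/pm
    # become whitespace-separated fields 4, 6 and 7.
    sed 's/[ap]m/ &/;s/12:/00:/;s/:/ : /' |
    # Keys in order: year, month name, day, am/pm, hour (numeric), minute.
    LC_ALL=C sort -k3,3 -k1,1M -k2,2 -k7,7 -k4,4n -k6,6 |
    # Undo the split and turn the temporary 00 hour back into 12.
    sed 's/ : /:/;s/ \([ap]m\)/\1/;s/00:/12:/'
}
```

Usage would be sort_log_fields < log. The day and minute keys are compared as text, which relies on both being zero-padded to two digits as in the sample log.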
Answer 5 (score: 0)
A pure-Perl solution based on the Schwartzian transform:
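use feature 'say';
use Time::Piece;
# decorate each line with its epoch time, sort numerically, print the original line
say $_->[1] for sort {$a->[0] <=> $b->[0]}
    map [Time::Piece->strptime($_, "%B %d, %Y, %l:%M%p")->strftime("%s"), $_], @_;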
This assumes the array @_ contains the lines of the log file. It uses the Time::Piece module.