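I am trying to write a script that finds the unique lines (first occurrence only) based on a column/delimiter. In this case, as far as I understand, the delimiter is ":".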
May 14 00:00:01 SERVER1 ntp[1006]: ntpd[Info]: 1430748797.780852: ndtpq.c(20544): this is the log
May 14 00:00:01 SERVER1 ntp[1006]: ntpd[Info]: 1430748797.780853: ndtpq.c(20544): this is another log
May 14 00:00:02 SERVER1 ntp[1006]: ntpd[Info]: 1430748798.780852: ndtpq.c(20544): this is another log
May 14 00:00:03 SERVER1 ntp[1006]: ntpd[Info]: 1430748799.780852: ndtpq.c(20544): this is the log
May 14 00:00:04 SERVER1 ntp[1006]: ntpd[Info]: 1430748800.780852: ndtpq.c(20544): this is the log
May 14 00:00:04 SERVER1 ntp[1006]: ntpd[Info]: 1430748800.790852: ndtpq.c(20544): this is the log
May 14 00:00:05 SERVER1 ntp[1006]: ntpd[Info]: 1430748801.790852: ndtpq.c(20544): thisis really different log
Expected output:
May 14 00:00:01 SERVER1 ntp[1006]: ntpd[Info]: 1430748797.780852: ndtpq.c(20544): this is the log
May 14 00:00:01 SERVER1 ntp[1006]: ntpd[Info]: 1430748797.780853: ndtpq.c(20544): this is another log
May 14 00:00:05 SERVER1 ntp[1006]: ntpd[Info]: 1430748801.790852: ndtpq.c(20544): thisis really different log
I can get the unique log messages with the following command, but that way I lose the timestamps.
cat fileName |awk -F: '{print $7}'
Answer 0: (score: 2)
This may do it:
awk -F: '!seen[$NF]++' file
May 14 00:00:01 SERVER1 ntp[1006]: ntpd[Info]: 1430748797.780852: ndtpq.c(20544): this is the log
May 14 00:00:01 SERVER1 ntp[1006]: ntpd[Info]: 1430748797.780853: ndtpq.c(20544): this is another log
May 14 00:00:05 SERVER1 ntp[1006]: ntpd[Info]: 1430748801.790852: ndtpq.c(20544): thisis really different log
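It uses ":" to split each line into fields, then looks at the last field and prints a line only the first time that last field is seen.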
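The one-liner is compact, so here is a minimal long-hand sketch of the same logic. The temp file and the variable names (log, first_seen, msg) are only for illustration, and the sample is a subset of the lines from the question.

```shell
# Write a small sample log (a subset of the lines from the question) to a temp file.
log=$(mktemp)
cat > "$log" <<'EOF'
May 14 00:00:01 SERVER1 ntp[1006]: ntpd[Info]: 1430748797.780852: ndtpq.c(20544): this is the log
May 14 00:00:01 SERVER1 ntp[1006]: ntpd[Info]: 1430748797.780853: ndtpq.c(20544): this is another log
May 14 00:00:03 SERVER1 ntp[1006]: ntpd[Info]: 1430748799.780852: ndtpq.c(20544): this is the log
May 14 00:00:05 SERVER1 ntp[1006]: ntpd[Info]: 1430748801.790852: ndtpq.c(20544): thisis really different log
EOF

# Long-hand equivalent of: awk -F: '!seen[$NF]++'
first_seen=$(awk -F: '
{
    msg = $NF                # last ":"-separated field, i.e. the message text
    if (!(msg in seen)) {    # true only the first time this message appears
        seen[msg] = 1        # remember it so later duplicates are skipped
        print                # print the whole line, timestamp included
    }
}' "$log")

printf '%s\n' "$first_seen"
```

Because the whole line is printed rather than just the last field, the timestamp of the first occurrence is kept.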
Answer 1: (score: 1)
Try this.
awk:
awk -F: '!x[$NF]++' infile
If the order does not matter, with GNU sort:
sort -u -t: -k7 infile
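One difference between the two commands is worth noting: $NF is only the text after the last ":", while -k7 compares from field 7 to the end of the line. They agree on the sample in the question, but they would differ if a message itself contained ":". A rough sketch of an awk variant that keeps the input order and keys on everything from field 7 onward follows. It assumes the message always starts at field 7, and the three log lines are made up for illustration.

```shell
# Hypothetical sample: the messages contain an extra ":".
sample=$(mktemp)
cat > "$sample" <<'EOF'
May 14 00:00:01 SERVER1 ntp[1006]: ntpd[Info]: 1430748797.780852: ndtpq.c(20544): error: disk full
May 14 00:00:02 SERVER1 ntp[1006]: ntpd[Info]: 1430748798.780852: ndtpq.c(20544): warning: disk full
May 14 00:00:03 SERVER1 ntp[1006]: ntpd[Info]: 1430748799.780852: ndtpq.c(20544): error: disk full
EOF

# Key on the text after the sixth ":" (field 7 through end of line).
by_message=$(awk '
{
    key = $0
    for (i = 1; i < 7; i++)        # drop the first six ":"-terminated fields
        sub(/^[^:]*:/, "", key)
    if (!seen[key]++)              # first time this full message is seen
        print
}' "$sample")

# For comparison: keying on $NF treats all three lines as the same message.
by_last=$(awk -F: '!seen[$NF]++' "$sample")

printf '%s\n' "$by_message"
```

Here by_message keeps the first "error" line and the "warning" line, whereas by_last keeps only the first line because every line ends in " disk full".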