List unique lines based on the ":" delimiter

Time: 2015-05-14 08:45:37

Tags: linux shell unix awk sed

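I am trying to write a script that finds unique lines (keeping the first occurrence) based on a column/delimiter. In this case, as I understand it, the delimiter is ":".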

For example:

May 14 00:00:01  SERVER1 ntp[1006]:  ntpd[Info]: 1430748797.780852: ndtpq.c(20544): this is the log  
May 14 00:00:01  SERVER1 ntp[1006]:  ntpd[Info]: 1430748797.780853: ndtpq.c(20544): this is another log  
May 14 00:00:02  SERVER1 ntp[1006]:  ntpd[Info]: 1430748798.780852: ndtpq.c(20544): this is another log  
May 14 00:00:03  SERVER1 ntp[1006]:  ntpd[Info]: 1430748799.780852: ndtpq.c(20544): this is the log  
May 14 00:00:04  SERVER1 ntp[1006]:  ntpd[Info]: 1430748800.780852: ndtpq.c(20544): this is the log  
May 14 00:00:04  SERVER1 ntp[1006]:  ntpd[Info]: 1430748800.790852: ndtpq.c(20544): this is the log  
May 14 00:00:05  SERVER1 ntp[1006]:  ntpd[Info]: 1430748801.790852: ndtpq.c(20544): thisis really different log  

Expected output:

May 14 00:00:01  SERVER1 ntp[1006]:  ntpd[Info]: 1430748797.780852: ndtpq.c(20544): this is the log  
May 14 00:00:01  SERVER1 ntp[1006]:  ntpd[Info]: 1430748797.780853: ndtpq.c(20544): this is another log  
May 14 00:00:05  SERVER1 ntp[1006]:  ntpd[Info]: 1430748801.790852: ndtpq.c(20544): thisis really different log  

I can extract the log messages with the following command, but this way I lose the timestamps.

cat fileName |awk -F: '{print $7}'

2 answers:

Answer 0 (score: 2)

This should do it:

awk -F: '!seen[$NF]++' file
May 14 00:00:01  SERVER1 ntp[1006]:  ntpd[Info]: 1430748797.780852: ndtpq.c(20544): this is the log
May 14 00:00:01  SERVER1 ntp[1006]:  ntpd[Info]: 1430748797.780853: ndtpq.c(20544): this is another log
May 14 00:00:05  SERVER1 ntp[1006]:  ntpd[Info]: 1430748801.790852: ndtpq.c(20544): thisis really different log

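It splits each line on ":", looks at the last field, and prints a line only the first time that field's value is seen, so the full line (timestamp included) is kept.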
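For anyone unfamiliar with the idiom, here is a minimal long-hand sketch of what `!seen[$NF]++` does. The sample input is three lines copied from the question (trailing spaces removed), and the variable names `input`, `result`, and `key` are only illustrative.

```shell
# Three lines from the question; the first and third share the same message.
input='May 14 00:00:01  SERVER1 ntp[1006]:  ntpd[Info]: 1430748797.780852: ndtpq.c(20544): this is the log
May 14 00:00:01  SERVER1 ntp[1006]:  ntpd[Info]: 1430748797.780853: ndtpq.c(20544): this is another log
May 14 00:00:03  SERVER1 ntp[1006]:  ntpd[Info]: 1430748799.780852: ndtpq.c(20544): this is the log'

# Long-hand equivalent of: awk -F: '!seen[$NF]++'
result=$(printf '%s\n' "$input" | awk -F: '{
    key = $NF               # last ":"-separated field, i.e. the message text
    if (!(key in seen))     # true only the first time this message appears
        print               # print the whole line, timestamp included
    seen[key] = 1           # remember the message for later lines
}')

printf '%s\n' "$result"     # prints the first two lines; the 00:00:03 line is dropped
```

The one-liner folds these steps together: `seen[$NF]++` evaluates to 0 (false) on the first occurrence and then increments, so negating it selects exactly the first line for each message.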

Answer 1 (score: 1)

Try this:

awk

 awk -F: '!x[$NF]++' infile
If the order does not matter,

GNU sort

 sort -u -t: -k7 infile
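
A small sketch of what the sort variant produces, using four lines copied from the question (trailing spaces removed). `LC_ALL=C` is my own addition so that the ordering is byte-wise and reproducible; the variable names are illustrative. Note that the result is ordered by message text rather than by time, which is why this only fits when order does not matter.

```shell
input='May 14 00:00:01  SERVER1 ntp[1006]:  ntpd[Info]: 1430748797.780852: ndtpq.c(20544): this is the log
May 14 00:00:01  SERVER1 ntp[1006]:  ntpd[Info]: 1430748797.780853: ndtpq.c(20544): this is another log
May 14 00:00:03  SERVER1 ntp[1006]:  ntpd[Info]: 1430748799.780852: ndtpq.c(20544): this is the log
May 14 00:00:05  SERVER1 ntp[1006]:  ntpd[Info]: 1430748801.790852: ndtpq.c(20544): thisis really different log'

# -t: sets ":" as the field separator, -k7 sorts on field 7 through end of line,
# -u keeps a single line per distinct key.
unique=$(printf '%s\n' "$input" | LC_ALL=C sort -u -t: -k7)

# Show only the message part of each surviving line.
messages=$(printf '%s\n' "$unique" | awk -F': ' '{print $NF}')
printf '%s\n' "$messages"
```

With byte-wise ordering the messages come out as "this is another log", "this is the log", "thisis really different log". GNU sort documents -u as outputting only the first of an equal run; if you depend on which timestamp survives, the awk version above is the safer choice.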