Question

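I am trying to write a script that finds the unique lines (first occurrence) based on a column/delimiter. In this case, as I understand it, the delimiter is ":".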

For example:

May 14 00:00:01  SERVER1 ntp[1006]:  ntpd[Info]: 1430748797.780852: ndtpq.c(20544): this is the log  
May 14 00:00:01  SERVER1 ntp[1006]:  ntpd[Info]: 1430748797.780853: ndtpq.c(20544): this is another log  
May 14 00:00:02  SERVER1 ntp[1006]:  ntpd[Info]: 1430748798.780852: ndtpq.c(20544): this is another log  
May 14 00:00:03  SERVER1 ntp[1006]:  ntpd[Info]: 1430748799.780852: ndtpq.c(20544): this is the log  
May 14 00:00:04  SERVER1 ntp[1006]:  ntpd[Info]: 1430748800.780852: ndtpq.c(20544): this is the log  
May 14 00:00:04  SERVER1 ntp[1006]:  ntpd[Info]: 1430748800.790852: ndtpq.c(20544): this is the log  
May 14 00:00:05  SERVER1 ntp[1006]:  ntpd[Info]: 1430748801.790852: ndtpq.c(20544): thisis really different log

Desired output:

May 14 00:00:01  SERVER1 ntp[1006]:  ntpd[Info]: 1430748797.780852: ndtpq.c(20544): this is the log  
May 14 00:00:01  SERVER1 ntp[1006]:  ntpd[Info]: 1430748797.780853: ndtpq.c(20544): this is another log  
May 14 00:00:05  SERVER1 ntp[1006]:  ntpd[Info]: 1430748801.790852: ndtpq.c(20544): thisis really different log

I can find the unique log messages with the following command, but this way I lose the timestamps.

cat fileName |awk -F: '{print $7}'

Answer 1

This might do it:

awk -F: '!seen[$NF]++' file
May 14 00:00:01  SERVER1 ntp[1006]:  ntpd[Info]: 1430748797.780852: ndtpq.c(20544): this is the log
May 14 00:00:01  SERVER1 ntp[1006]:  ntpd[Info]: 1430748797.780853: ndtpq.c(20544): this is another log
May 14 00:00:05  SERVER1 ntp[1006]:  ntpd[Info]: 1430748801.790852: ndtpq.c(20544): thisis really different log

It splits each line on ":", looks at the last field, and prints a line only the first time that field's value is seen.
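
The one-liner packs the whole test into `!seen[$NF]++`. A longer sketch of the same idea may make the steps easier to follow. The function name `first_by_last_field` is made up for illustration, and the trailing-whitespace trim is an added assumption (that "this is the log" with and without trailing spaces should count as the same message); it is not part of the original answer.

```shell
# first_by_last_field (hypothetical name): print the first line seen for each
# distinct last ":"-separated field. Reads the named files, or stdin if none.
first_by_last_field() {
  awk -F: '
    {
      key = $NF                    # last field after splitting on ":"
      sub(/[ \t]+$/, "", key)      # assumption: ignore trailing whitespace in the key
      if (!(key in seen)) {        # first time this message text appears
        seen[key] = 1
        print                      # print the whole original line, timestamp included
      }
    }
  ' "$@"
}

# Three lines taken from the question; the third repeats the first message.
printf '%s\n' \
  'May 14 00:00:01  SERVER1 ntp[1006]:  ntpd[Info]: 1430748797.780852: ndtpq.c(20544): this is the log' \
  'May 14 00:00:02  SERVER1 ntp[1006]:  ntpd[Info]: 1430748798.780852: ndtpq.c(20544): this is another log' \
  'May 14 00:00:03  SERVER1 ntp[1006]:  ntpd[Info]: 1430748799.780852: ndtpq.c(20544): this is the log' \
  | first_by_last_field
```

Because only the key is trimmed, the printed lines are unchanged from the input. Dropping the `sub(...)` line should give the same behavior as `!seen[$NF]++`.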

Answer 2

Try this.

With awk:

 awk -F: '!x[$NF]++' infile

If the order does not matter, with GNU sort:

 sort -u -t: -k7 infile
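
A small sketch of the sort variant, for comparison. `-k7` starts the key at field 7 and runs to the end of the line, while `-k7,7` stops at the end of field 7; the two coincide here because field 7 is the last field. The function name `unique_by_field7` and the `LC_ALL=C` setting (for plain byte-wise ordering) are assumptions added for illustration.

```shell
# unique_by_field7 (hypothetical name): keep one line per distinct 7th
# ":"-separated field. Output is ordered by that field, not by file position.
unique_by_field7() {
  LC_ALL=C sort -u -t: -k7,7 "$@"
}

# Three lines taken from the question; two share the message "this is the log".
printf '%s\n' \
  'May 14 00:00:01  SERVER1 ntp[1006]:  ntpd[Info]: 1430748797.780852: ndtpq.c(20544): this is the log' \
  'May 14 00:00:01  SERVER1 ntp[1006]:  ntpd[Info]: 1430748797.780853: ndtpq.c(20544): this is another log' \
  'May 14 00:00:03  SERVER1 ntp[1006]:  ntpd[Info]: 1430748799.780852: ndtpq.c(20544): this is the log' \
  | unique_by_field7
```

This prints two lines, with the "this is another log" line first because the output is sorted by the message text. When first-occurrence order matters, the awk approach is the safer choice.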

List unique lines based on the ":" delimiter